This invention relates to the field of genetic engineering and more particularly to the insertion of genes
for the protein apoaequorin into recombinant DNA vectors and to the production of apoaequorin in recipient
strains of microorganisms.

Apoaequorin is a single polypeptide chain protein which can be isolated from the luminous jellyfish
Aequorea victoria. When this protein contains one molecule of coelenterate luciferin bound non-covalently
to it, it is known as aequorin. Aequorin is oxidized in the presence of calcium ions to produce visible light.
Once light is produced, the spent protein (apoaequorin) can be purified from the oxidized luciferin and
subsequently recharged using natural or synthetic luciferin under appropriate conditions. The addition of
calcium ions to the recharged aequorin will again result in the production of light. Apoaequorin can therefore
be used in various chemical and biochemical assays as a marker.

Natural apoaequorin is not a single compound but rather represents a mixture of molecular species.
When pure natural aequorin, representing that of many thousands of individual Aequorea, is subjected to
electrophoresis [O. Gabriel, Methods Enzymol. 22, 565-578 (1971)] in alkaline buffers under non-denaturing
conditions, including 0.1 mM EDTA in all buffers, at least six distinct bands of blue luminescence
are visible when the gel (0.5 cm x 10 cm) is immersed in 0.1 M CaCl2. This observation agrees with that of
J.R. Blinks and G.C. Harres [Fed. Proc., 34, 474 (1975)], who observed as many as twelve luminescent
bands after the isoelectric focusing of a similar extract. Blinks and Harres observed more species because
isoelectric focusing is capable of higher resolution than is electrophoresis. However, none of the bands was
ever isolated as a pure peptide.

Furthermore, it is difficult to produce sufficient aequorin or apoaequorin from jellyfish or other natural
sources to provide the amounts necessary for use in bioluminescence assays. Accordingly, an improved
means for producing apoaequorin in sufficient quantities for commercial utilization is greatly needed.

Recently developed techniques have made it possible to employ microorganisms, capable of rapid and
abundant growth, for the synthesis of commercially useful proteins and peptides, regardless of their source
in nature. These techniques make it possible to genetically endow a suitable microorganism with the ability
to synthesize a protein or peptide normally made by another organism. The technique makes use of a
fundamental relationship which exists in all living organisms between the genetic material, usually DNA, and
the proteins synthesized by the organism. This relationship is such that the amino acid sequence of the
protein is reflected in the nucleotide sequence of the DNA. There are one or more trinucleotide sequence
groups specifically related to each of the twenty amino acids most commonly occurring in proteins. The
specific relationship between each given trinucleotide sequence and its corresponding amino acid
constitutes the genetic code. The genetic code is believed to be the same or similar for all living organisms. As
a consequence, the amino acid sequence of every protein or peptide is reflected by a corresponding
nucleotide sequence, according to a well understood relationship. Furthermore, this sequence of nucleotides
can, in principle, be translated by any living organism.
TABLE 1

GENETIC CODE

Phenylalanine (Phe)    TTK        Histidine (His)        CAK
Leucine (Leu)          XTY        Glutamine (Gln)        CAJ
Isoleucine (Ile)       ATM        Asparagine (Asn)       AAK
Methionine (Met)       ATG        Lysine (Lys)           AAJ
Valine (Val)           GTL        Aspartic acid (Asp)    GAK
Serine (Ser)           QRS        Glutamic acid (Glu)    GAJ
Proline (Pro)          CCL        Cysteine (Cys)         TGK
Threonine (Thr)        ACL        Tryptophan (Try)       TGG
Alanine (Ala)          GCL        Arginine (Arg)         WGZ
Tyrosine (Tyr)         TAK        Glycine (Gly)          GGL
Termination signal     TAJ
Termination signal     TGA

Key: Each 3-letter triplet represents a trinucleotide of DNA having a 5' end on the left and a 3' end on
the right. The letters stand for the purine or pyrimidine bases forming the nucleotide sequence.

A  = adenine
G  = guanine
C  = cytosine
J  = A or G
K  = T or C
L  = A, T, C or G
M  = A, C or T
T  = thymine
X  = T or C if Y is A or G
X  = C if Y is C or T
Y  = A, G, C or T if X is C
Y  = A or G if X is T
W  = C or A if Z is A or G
W  = C if Z is C or T
Z  = A, G, C or T if W is C
Z  = A or G if W is A
QR = TC if S is A, G, C or T
QR = AG if S is T or C
S  = A, G, C or T if QR is TC
S  = T or C if QR is AG
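The shorthand of Table 1 can be made concrete with a short sketch that expands each symbol into the DNA codons it stands for. This is an illustration only, written in Python; the names SIMPLE and expand are hypothetical, and the three interdependent symbols (XTY, WGZ, QRS) are written out by hand from the key rather than derived.

```python
from itertools import product

# One-letter symbols from the key to Table 1 and the bases each may stand for.
SIMPLE = {
    "A": "A", "G": "G", "C": "C", "T": "T",
    "J": "AG",    # A or G
    "K": "TC",    # T or C
    "L": "ATCG",  # A, T, C or G
    "M": "ACT",   # A, C or T
}

def expand(symbol):
    """Return the sorted list of concrete DNA codons for one Table 1 entry."""
    if symbol == "XTY":  # leucine: X and Y constrain each other
        return sorted(["TTA", "TTG"] + ["CT" + y for y in "ATCG"])
    if symbol == "WGZ":  # arginine: W and Z constrain each other
        return sorted(["AGA", "AGG"] + ["CG" + z for z in "ATCG"])
    if symbol == "QRS":  # serine: QR and S constrain each other
        return sorted(["TC" + s for s in "ATCG"] + ["AGT", "AGC"])
    pools = [SIMPLE[ch] for ch in symbol]
    return sorted("".join(bases) for bases in product(*pools))

print(expand("TTK"))       # ['TTC', 'TTT']
print(len(expand("XTY")))  # 6
```

Expanding all twenty-two entries of Table 1 in this way accounts for each of the 64 possible triplets exactly once.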
The trinucleotides of Table 1, termed codons, are presented as DNA trinucleotides, as they exist in the
genetic material of a living organism. Expression of these codons in protein synthesis requires intermediate
formation of messenger RNA (mRNA), as described more fully, infra. The mRNA codons have the same
sequences as the DNA codons of Table 1, except that uracil is found in place of thymine. Complementary
trinucleotide DNA sequences having opposite strand polarity are functionally equivalent to the codons of
Table 1, as is understood in the art. An important and well known feature of the genetic code is its
redundancy, whereby, for most of the amino acids used to make proteins, more than one coding nucleotide
triplet may be employed. Therefore, a number of different nucleotide sequences may code for a given
amino acid sequence. Such nucleotide sequences are considered functionally equivalent since they can
result in the production of the same amino acid sequence in all organisms, although certain strains may
translate some sequences more efficiently than they do others. Occasionally, a methylated variant of a
purine or pyrimidine may be found in a given nucleotide sequence. Such methylations do not affect the
coding relationship in any way.
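The extent of this redundancy can be illustrated by counting how many distinct DNA sequences encode a given peptide: the count is the product of the number of codons available for each residue. The sketch below is illustrative only; the names CODON_COUNT and coding_sequence_count are hypothetical, and the per-residue counts are those of the standard genetic code as summarized in Table 1.

```python
from math import prod

# Number of DNA codons available for each amino acid (one-letter symbols),
# following the standard genetic code of Table 1.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def coding_sequence_count(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(CODON_COUNT[residue] for residue in peptide)

# Val-Lys-Leu-Thr-Pro: 4 * 2 * 6 * 4 * 4 functionally equivalent sequences.
print(coding_sequence_count("VKLTP"))  # 768
```

Even a five-residue peptide therefore has hundreds of functionally equivalent coding sequences, and the number grows multiplicatively with length.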

In its basic outline, a method of endowing a microorganism with the ability to synthesize a new protein
involves three general steps: (1) isolation and purification (or chemical synthesis) of the specific gene or
nucleotide sequence containing the genetically coded information for the amino acid sequence of the
desired protein, (2) recombination of the isolated nucleotide sequence with an appropriate vector, typically
the DNA of a bacteriophage or plasmid, and (3) transfer of the vector to the appropriate microorganism and
selection of a strain of the recipient microorganism containing the desired genetic information.

A fundamental difficulty encountered in attempts to exploit commercially the above-described process
lies in the first step, the isolation and purification of the desired specific genetic information. DNA exists in
all living cells in the form of extremely high molecular weight chains of nucleotides. A cell may contain
more than 10,000 structural genes, coding for the amino acid sequences of over 10,000 specific proteins,
each gene having a sequence many hundreds of nucleotides in length. For the most part, four different
nucleotide bases make up all the existing sequences. These are adenine (A), guanine (G), cytosine (C), and
thymine (T). The long sequences comprising the structural genes of specific proteins are consequently very
similar in overall chemical composition and physical properties. The separation of one such sequence from
the plethora of other sequences present in isolated DNA cannot ordinarily be accomplished by conventional
physical and chemical preparative methods.

Two general methods have been used in the prior art to accomplish step (1) in the above-described
general procedure. The first method is sometimes referred to as the shotgun technique. The DNA of an
organism is fragmented into segments generally longer than the desired nucleotide sequence. Step (1) of
the above-described process is essentially bypassed. The DNA fragments are immediately recombined
with the desired vector, without prior purification of specific sequences. Optionally, a crude fractionation
step may be interposed. The selection techniques of microbial genetics are relied upon to select, from
among all the possibilities, a strain of microorganism containing the desired genetic information. The
shotgun procedure suffers from two major disadvantages. Most importantly, the procedure can result in the
transfer of hundreds of unknown genes into recipient microorganisms, so that during the experiment, new
strains are created, having unknown genetic capabilities. Therefore, the use of such a procedure could
create a hazard for laboratory workers and for the environment. A second disadvantage of the shotgun
method is that it is extremely inefficient for the production of the desired strain, and is dependent upon the
use of a selection technique having sufficient resolution to compensate for the lack of fractionation in the
first step. However, methods of overcoming these disadvantages exist, as will become apparent in later
sections of this application.

The second general method takes advantage of the fact that the total genetic information in a cell is
seldom, if ever, expressed at any given time. In particular, the differentiated tissues of higher organisms
may be synthesizing only a minor portion of the proteins which the organism is capable of making at any
one time. In extreme cases, such cells may be synthesizing predominantly one protein. In such extreme
cases, it has been possible to isolate the nucleotide sequence coding for the protein in question by isolating
the corresponding messenger RNA from the appropriate cells.

Messenger RNA functions in the process of converting the nucleotide sequence information of DNA into
the amino acid sequence structure of a protein. In the first step of this process, termed transcription, a local
segment of DNA having a nucleotide sequence which specifies a protein to be made is copied into RNA.
RNA is a polynucleotide similar to DNA except that ribose is substituted for deoxyribose and uracil is used
in place of thymine. The nucleotide bases in RNA are capable of entering into the same kind of base
pairing relationships that are known to exist between the complementary strands of DNA. A and U (T) are
complementary, and G and C are complementary. The RNA transcript of a DNA nucleotide sequence will
be complementary to the copied sequence. Such RNA is termed messenger RNA (mRNA) because of its
status as intermediary between the genetic apparatus of the cell and its protein synthesizing apparatus.
Generally, the only mRNA sequences present in the cell at any given time are those which correspond to
proteins being actively synthesized at that time. Therefore, a differentiated cell whose function is devoted
primarily to the synthesis of a single protein will contain primarily the RNA species corresponding to that
protein. In those instances where it is feasible, the isolation and purification of the appropriate nucleotide
sequence coding for a given protein can be accomplished by taking advantage of the specialized synthesis
of such protein in differentiated cells.
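The base-pairing rule for transcription described above can be sketched in a few lines. The example is a minimal illustration, not a model of the enzymology; the function name transcribe and the nine-base template are hypothetical, and both strands are written 5' to 3'.

```python
# Pairing of each DNA template base with the RNA base incorporated opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_5to3):
    """Return the mRNA (5'->3') complementary to a DNA template strand (5'->3').

    The strands are antiparallel, so the template is read in reverse while
    each base is replaced by its RNA complement.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3))

# Hypothetical template strand whose partner (coding) strand is ATGGTCAAA.
print(transcribe("TTTGACCAT"))  # AUGGUCAAA
```

The transcript is complementary to the copied strand and matches the opposite DNA strand except that U appears in place of T.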

A major disadvantage of the foregoing procedure is that it is applicable only in the relatively rare
instances where cells can be found engaged in synthesizing primarily a single protein. The majority of
proteins of commercial interest are not synthesized in such a specialized way. The desired protein may be
one of a hundred or so different proteins being produced by the cells of a tissue or organism at a given
time. Nevertheless, the mRNA isolation technique is useful since the set of RNA species present in the cell
usually represents only a fraction of the total sequences existing in the DNA, and thus provides an initial
purification.

In a more recent development, U.S. Patent 4,363,877 provides a process whereby nucleotide
sequences can be isolated and purified even when present at a frequency as low as 2% of a heterogeneous
population of mRNA sequences. Furthermore, the method may be combined with known methods of
fractionating mRNA to isolate and purify sequences present in even lower frequency in the total RNA
population as initially isolated. The method is generally applicable to mRNA species extracted from virtually
any organism and therefore provides a powerful basic tool for the ultimate production of proteins of
commercial and research interest, in useful quantities.

The process takes advantage of certain structural features of mRNA and DNA, and makes use of
certain enzyme catalyzed reactions. The nature of these reactions and structural details as they are
understood in the prior art are described herein and are further detailed in the patent. The symbols and
abbreviations used herein are set forth in the following table.

TABLE 2

DNA  - deoxyribonucleic acid                A    - Adenine
RNA  - ribonucleic acid                     T    - Thymine
cDNA - complementary DNA (enzymatically     G    - Guanine
       synthesized from an mRNA sequence)
mRNA - messenger RNA                        C    - Cytosine
dATP - deoxyadenosine triphosphate          U    - Uracil
dGTP - deoxyguanosine triphosphate          Tris - 2-Amino-2-hydroxymethyl-1,3-propanediol
dCTP - deoxycytidine triphosphate           TCA  - Trichloroacetic acid
dTTP - thymidine triphosphate               EDTA - ethylenediamine tetraacetic acid
ATP  - adenosine triphosphate


In its native configuration, DNA exists in the form of paired linear polynucleotide strands. The
complementary base pairing relationships described above exist between the paired strands such that each
nucleotide base of one strand exists opposite its complement on the other strand. The entire sequence of
one strand is mirrored by a complementary sequence on the other strand. If the strands are separated, it is
possible to synthesize a new partner strand, starting from the appropriate precursor monomers. The
sequence of addition of the monomers starting from one end is determined by, and complementary to, the
sequence of the original intact polynucleotide strand, which thus serves as a template for the synthesis of
its complementary partner. The synthesis of mRNA corresponding to a specific nucleotide sequence of
DNA is understood to follow the same basic principle. Therefore a specific mRNA molecule will have a
sequence complementary to one strand of DNA and identical to the sequence of the opposite DNA strand,
in the region transcribed. Enzymatic mechanisms exist within living cells which permit the selective
transcription of a particular DNA segment containing the nucleotide sequence for a particular protein.
Consequently, isolating the mRNA which contains the nucleotide sequence coding for the amino acid
sequence of a particular protein is equivalent to the isolation of the same sequence, or gene, from the DNA
itself. If the mRNA is retranscribed to form DNA complementary thereto (cDNA), the exact DNA sequence is
thereby reconstituted and can, by appropriate techniques, be inserted into the genetic material of another
organism. The two complementary versions of a given sequence are therefore interconvertible and
functionally equivalent to each other.
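The interconvertibility of the mRNA and DNA versions of a sequence can be shown with a small round trip: an mRNA is copied into a complementary DNA strand, and that strand is copied again to give its partner. The sketch is illustrative only; the function names and the short mRNA are hypothetical, and all strands are written 5' to 3'.

```python
# Complement of each RNA base in DNA, and of each DNA base in DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def first_strand_cdna(mrna):
    """DNA strand (5'->3') complementary to an mRNA written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

def second_strand(dna):
    """DNA strand (5'->3') complementary to a DNA strand written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(dna))

mrna = "AUGGUCAAA"                 # hypothetical message
cdna = first_strand_cdna(mrna)     # complementary DNA copy of the message
gene = second_strand(cdna)         # partner strand of the cDNA

print(cdna)  # TTTGACCAT
print(gene)  # ATGGTCAAA
```

The second strand carries the same sequence as the mRNA with T in place of U, which is the sense in which the original DNA sequence is reconstituted.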

The nucleotide subunits of DNA and RNA are linked together by phosphodiester bonds between the 5'
position of one nucleotide sugar and the 3' position of its next neighbor. Reiteration of such linkages
produces a linear polynucleotide which has polarity in the sense that one end can be distinguished from the
other. The 3' end may have a free 3'-hydroxyl, or the hydroxyl may be substituted with a phosphate or a
more complex structure. The same is true of the 5' end. In eucaryotic organisms, i.e., those having a
defined nucleus and mitotic apparatus, the synthesis of functional mRNA usually includes the addition of
polyadenylic acid to the 3' end of the mRNA. Messenger RNA can therefore be separated from other
classes of RNA isolated from a eucaryotic organism by column chromatography on cellulose to which is
attached polythymidylic acid. See Aviv, H. and Leder, P., Proc. Natl. Acad. Sci., USA, 69, 1408 (1972). Other
chromatographic methods, exploiting the base-pairing affinity of poly A for chromatographic packing
materials containing oligo dT, poly U, or combinations of poly T and poly U, for example, poly U-Sepharose,
are likewise suitable.

Reverse transcriptase catalyzes the synthesis of DNA complementary to an RNA template strand in the
presence of the RNA template, a primer which may be any complementary oligo or polynucleotide having a
3'-hydroxyl, and the four deoxynucleoside triphosphates, dATP, dGTP, dCTP, and dTTP. The reaction is
initiated by the non-covalent association of the oligodeoxynucleotide primer near the 3' end of mRNA
followed by stepwise addition of the appropriate deoxynucleotides, as determined by base-pairing
relationships with the mRNA nucleotide sequence, to the 3' end of the growing chain. The product molecule
may be described as a hairpin structure in which the original RNA is paired by hydrogen bonding with a
complementary strand of DNA partly folded back upon itself at one end. The DNA and RNA strands are not
covalently joined to each other. Reverse transcriptase is also capable of catalyzing a similar reaction using
a single-stranded DNA template, in which case the resulting product is a double-stranded DNA hairpin
having a loop of single-stranded DNA joining one set of ends. See Aviv, H. and Leder, P., Proc. Natl.
Acad. Sci., USA 69, 1408 (1972) and Efstratiadis, A., Kafatos, F.C., Maxam, A.M., and Maniatis, T., Cell, 7,
279 (1976).

Restriction endonucleases are enzymes capable of hydrolyzing phosphodiester bonds in DNA, thereby
creating a break in the continuity of the DNA strand. If the DNA is in the form of a closed loop, the loop is
converted to a linear structure. The principal feature of a restriction enzyme is that its hydrolytic action is
exerted only at a point where a specific nucleotide sequence occurs. Such a sequence is termed the
restriction site for the restriction endonuclease. Restriction endonucleases from a variety of sources have
been isolated and characterized in terms of the nucleotide sequence of their restriction sites. When acting
on double-stranded DNA, some restriction endonucleases hydrolyze the phosphodiester bonds on both
strands at the same point, producing blunt ends. Others catalyze hydrolysis of bonds separated by a few
nucleotides from each other, producing free single-stranded regions at each end of the cleaved molecule.
Such single-stranded ends are self-complementary, hence cohesive, and may be used to rejoin the
hydrolyzed DNA. Since any DNA susceptible to cleavage by such an enzyme must contain the same
recognition site, the same cohesive ends will be produced, so that it is possible to join heterogeneous
sequences of DNA which have been treated with restriction endonuclease to other sequences similarly
treated. See Roberts, R.J., Crit. Rev. Biochem., 4, 123 (1976).
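Site-specific cleavage can be illustrated with a simple string model. The sketch below cuts the top strand of a linear DNA at every occurrence of a recognition sequence; the function name digest and the test sequence are hypothetical, and the default site is that of EcoRI (GAATTC, cut after the first G), chosen here only as a familiar example of an enzyme leaving cohesive ends.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut the top strand of a linear DNA at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that remain on the
    upstream fragment (1 for a cut after the first base of the site).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

# Hypothetical 21-base sequence containing two copies of the site.
print(digest("CCGGAATTCTTAAGAATTCGG"))
# ['CCGG', 'AATTCTTAAG', 'AATTCGG']
```

Because the site is the same on both strands when each is read 5' to 3', the bottom strand is cut at the mirror position, so every fragment end made at a site carries the same four-base single-stranded extension. That shared extension is what allows fragments from unrelated DNA to be rejoined.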

It has been observed that restriction sites for a given enzyme are relatively rare and are nonuniformly
distributed. Whether a specific restriction site exists within a given segment is a matter which must be
empirically determined. However, there is a large and growing number of restriction endonucleases, isolated
from a variety of sources with varied site specificity, so that there is a reasonable probability that a given
segment of a thousand nucleotides will contain one or more restriction sites.
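A rough feel for this probability can be had from a back-of-the-envelope model. The sketch below assumes a random sequence with equal base frequencies and treats every position and every enzyme as independent; these are simplifying assumptions not made in the text (which notes that real sites are nonuniformly distributed), and the function name and parameters are hypothetical.

```python
def p_at_least_one_site(segment_len, site_len, n_enzymes=1):
    """Approximate chance that a random segment contains at least one site.

    Assumes equal base frequencies, independent positions, and enzymes
    with distinct recognition sequences of the same length.
    """
    positions = segment_len - site_len + 1   # possible start positions
    p_site = 0.25 ** site_len                # chance of a match at one position
    return 1.0 - (1.0 - p_site) ** (positions * n_enzymes)

one = p_at_least_one_site(1000, 6)                  # single six-base site
ten = p_at_least_one_site(1000, 6, n_enzymes=10)    # ten distinct six-base sites
print(round(one, 2), round(ten, 2))
```

Under these assumptions a single six-base site is found in roughly one segment in five, while ten enzymes with different six-base sites raise the chance of at least one cut to about nine in ten.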

For general background see Watson, J.D., The Molecular Biology of the Gene, 3d Ed., Benjamin, Menlo
Park, Calif., (1976); Davidson, J.N., The Biochemistry of the Nucleic Acids, 8th Ed., Revised by Adams,
R.L.P., Burdon, R.H., Campbell, A.M. and Smellie, R.M.S., Academic Press, New York, (1976); and
Hayes, W., The Genetics of Bacteria and Their Viruses, Studies in The Biochemistry 2d Ed., Blackwell
Scientific Pub., Oxford (1968).

Accordingly, it is an object of this invention to provide a microorganism capable of providing useful
quantities of apoaequorin.

It is a further object of this invention to provide a recombinant DNA vector capable of being inserted
into a microorganism and expressing apoaequorin.

It is still another object of this invention to provide a DNA segment of defined structure that can be
produced synthetically or isolated from natural sources and that can be used in the production of the
desired recombinant DNA vectors.

It is yet another object of this invention to provide a peptide that can be produced synthetically in a
laboratory or by a microorganism and that will mimic the activity of natural apoaequorin.

These and other objects of the invention, as will hereinafter become more readily apparent, have been
accomplished by providing a homogeneous peptide selected from (1) compounds of
(a) a first formula

VKLTPDFNNPKWIGRHKHMFNFLDV
NHNGKISLDEMVYKASDIVINNLGA
TPEQAKRHKDAVEAFFGGAGMKYGV
ETDWPAYIEGWKKLATDELEKYAKN
QITLIRIWGDALFDIIDKDQNGAIT
LSEWKAYTKSAGIIQTSEECEETFR
VCDIDESGQLDVDEMTRQHLGFWYT
MDPACEKLYGGAVP-COOH



wherein A is alanine, C is cysteine, D is aspartate, E is glutamate, F is phenylalanine, G is glycine, H is
histidine, I is isoleucine, K is lysine, L is leucine, M is methionine, N is asparagine, P is proline, Q is
glutamine, R is arginine, S is serine, T is threonine, V is valine, W is tryptophan, and Y is tyrosine,

(b) a second formula in which P5 is replaced by S, N8 is replaced by D, K11 is replaced by R, D78 is
replaced by E, A81 is replaced by E, K88 is replaced by R, D92 is replaced by C or E, K96 is replaced by
R, A98 is replaced by S, Q101 is replaced by E, I102 is replaced by P, I107 is replaced by L, I116 is
replaced by V, S127 is replaced by D, S135 is replaced by A, T141 is replaced by S, or E144 is replaced
by D in said first formula, wherein subscript numbers refer to the amino acid position numbered from the
amino terminal of said first formula,

(c) a third formula in which from 1 to 15 amino acids are absent from either the amino terminal, the
carboxy terminal, or both terminals of said first formula or said second formula, or

(d) a fourth formula in which from 1 to 10 additional amino acids are attached sequentially to the amino
terminal, carboxy terminal, or both terminals of said first formula or said second formula and
(2) salts of compounds having said formulas, wherein said peptide is capable of binding coelenterate
luciferin and emitting light in the presence of Ca2+.
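The relationship between the first and second formulas can be sketched as a lookup of permitted replacements applied to the 189-residue sequence. The code is an illustration only; the names FIRST_FORMULA, ALLOWED and substitute are hypothetical, and the sequence is the first formula as read against the codon assignments given for each position elsewhere in this specification.

```python
FIRST_FORMULA = (
    "VKLTPDFNNPKWIGRHKHMFNFLDV"
    "NHNGKISLDEMVYKASDIVINNLGA"
    "TPEQAKRHKDAVEAFFGGAGMKYGV"
    "ETDWPAYIEGWKKLATDELEKYAKN"
    "QITLIRIWGDALFDIIDKDQNGAIT"
    "LSEWKAYTKSAGIIQTSEECEETFR"
    "VCDIDESGQLDVDEMTRQHLGFWYT"
    "MDPACEKLYGGAVP"
)

# Replacements named in the second formula:
# position -> (residue in first formula, permitted replacement residues).
ALLOWED = {
    5: ("P", "S"), 8: ("N", "D"), 11: ("K", "R"), 78: ("D", "E"),
    81: ("A", "E"), 88: ("K", "R"), 92: ("D", "CE"), 96: ("K", "R"),
    98: ("A", "S"), 101: ("Q", "E"), 102: ("I", "P"), 107: ("I", "L"),
    116: ("I", "V"), 127: ("S", "D"), 135: ("S", "A"), 141: ("T", "S"),
    144: ("E", "D"),
}

def substitute(position, new_residue, peptide=FIRST_FORMULA):
    """Return the peptide with one second-formula replacement applied.

    Positions are numbered from 1 at the amino terminal.
    """
    original, permitted = ALLOWED[position]
    if peptide[position - 1] != original:
        raise ValueError(f"expected {original} at position {position}")
    if new_residue not in permitted:
        raise ValueError(f"{new_residue} is not permitted at position {position}")
    return peptide[:position - 1] + new_residue + peptide[position:]

variant = substitute(5, "S")
print(len(FIRST_FORMULA), variant[:10])  # 189 VKLTSDFNNP
```

Each listed position holds the expected residue in the first formula, which gives a quick consistency check on the numbering from the amino terminal.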

DNA molecules, recombinant DNA vectors, and modified microorganisms comprising a nucleotide
sequence

GTL1 AAJ2 XTY3 ACL4 [CCL or QRS]5 GAK6 TTK7 [AAK or GAK]8 AAK9 CCL10
[AAJ or WGZ]11 TGG12 ATM13 GGL14 WGZ15 CAK16 AAJ17 CAK18 ATG19 TTK20
AAK21 TTK22 XTY23 GAK24 GTL25 AAK26 CAK27 AAK28 GGL29 AAJ30
ATM31 QRS32 XTY33 GAK34 GAJ35 ATG36 GTL37 TAK38 AAJ39 GCL40
QRS41 GAK42 ATM43 GTL44 ATM45 AAK46 AAK47 XTY48 GGL49 GCL50
ACL51 CCL52 GAJ53 CAJ54 GCL55 AAJ56 WGZ57 CAK58 AAJ59 GAK60
GCL61 GTL62 GAJ63 GCL64 TTK65 TTK66 GGL67 GGL68 GCL69 GGL70
ATG71 AAJ72 TAK73 GGL74 GTL75 GAJ76 ACL77 [GAK or GAJ]78 TGG79 CCL80
[GCL or GAJ]81 TAK82 ATM83 GAJ84 GGL85 TGG86 AAJ87 [AAJ or WGZ]88 XTY89 GCL90
ACL91 [GAK, GAJ, or TGK]92 GAJ93 XTY94 GAJ95 [AAJ or WGZ]96 TAK97 [GCL or QRS]98 AAJ99 AAK100
[CAJ or GAJ]101 [ATM or CCL]102 ACL103 XTY104 ATM105 WGZ106 [ATM or XTY]107 TGG108 GGL109 GAK110
GCL111 XTY112 TTK113 GAK114 ATM115 [ATM or GTL]116 GAK117 AAJ118 GAK119 CAJ120
AAK121 GGL122 GCL123 ATM124 ACL125 XTY126 [QRS or GAK]127 GAJ128 TGG129 AAJ130
GCL131 TAK132 ACL133 AAJ134 [QRS or GCL]135 GCL136 GGL137 ATM138 ATM139 CAJ140
[ACL or QRS]141 QRS142 GAJ143 [GAJ or GAK]144 TGK145 GAJ146 GAJ147 ACL148 TTK149 WGZ150
GTL151 TGK152 GAK153 ATM154 GAK155 GAJ156 QRS157 GGL158 CAJ159 XTY160
GAK161 GTL162 GAK163 GAJ164 ATG165 ACL166 WGZ167 CAJ168 CAK169 XTY170
GGL171 TTK172 TGG173 TAK174 ACL175 ATG176 GAK177 CCL178 GCL179 TGK180
GAJ181 AAJ182 XTY183 TAK184 GGL185 GGL186 GCL187 GTL188 CCL189

wherein 

A is deoxyadenyl, 
G is deoxyguanyl, 
C is deoxycytosyl, 
T is deoxythymidyl, 
J is A or G; 
K is T or C; 
L is A, T, C, or G; 
M is A, C or T; 

X is T or C, if the succeeding Y is A or G, and C if the succeeding Y is C or T;
Y is A, G, C or T, if the preceding X is C, and A or G if the preceding X is T;
W is C or A, if the succeeding Z is G or A, and C if the succeeding Z is C or T;
Z is A, G, C or T, if the preceding W is C, and A or G if the preceding W is A;
QR is TC, if the succeeding S is A, G, C or T, and AG if the succeeding S is T or C;
S is A, G, C or T, if the preceding QR is TC, and T or C if the preceding QR is AG; and
subscript numerals refer to the amino acid position in apoaequorin, for which the nucleotide sequence
corresponds according to the genetic code, the amino acid positions being numbered from the amino end,
or a nucleotide sequence coding a peptide previously mentioned, are also provided for use in carrying out
preferred aspects of the invention relating to the production of such peptides by the techniques of genetic
engineering.

The accompanying figures and drawings are provided to demonstrate the results obtained in the
specific examples which illustrate the invention but are not considered to be limiting thereof.

FIGURE 1 is a photograph showing an autoradiographic analysis of in vitro translated proteins using
poly(A+) RNA isolated from Aequorea jellyfish. The translation was performed in the absence (lane 1) or
presence (lane 3) of Aequorea poly(A+) RNA. The anti-aequorin immunoprecipitated proteins from the two
reactions were applied to lanes 2 and 4, respectively. On the right are marked the positions of the protein
molecular weight standards phosphorylase b, BSA, ovalbumin, carbonic anhydrase, SBTI and lysozyme.
The position of native aequorin is also indicated.

FIGURE 2(a) is a restriction map of a gene isolated from an Aequorea victoria jellyfish that contains a
DNA sequence coding for apoaequorin, while FIGURE 2(b) is a restriction map of the segment shown in
FIGURE 2(a) inserted into a plasmid downstream from a lac promoter.

FIGURE 3 is a graph of time- and oxygen-dependent formation of Ca2+-dependent photoprotein
activity in pAEQ1 extracts. Conditions used: (a) In curves 1 and 2, 0.5 ml aliquots of the active fractions
were made 2 mM in β-mercaptoethanol and 0.1 mM in coelenterate luciferin and incubated at 4° for the
times indicated. At appropriate time intervals, 5 µl aliquots were removed and assayed for photoprotein
activity. (b) In curve 2, dissolved O2 levels were reduced by bubbling with Ar gas and the mixture exposed
to oxygen at the time indicated. (c) In curve 3, native apoaequorin was used in the incubation mixture in
place of the pAEQ1 extract.

FIGURE 4 is a graph of a gel filtration profile of the Ca2+-dependent photoprotein activity generated
from pAEQ1 extracts. Partially purified apoaequorin activity from pAEQ1 extracts was used to generate
Ca2+-dependent photoprotein activity as described in FIGURE 3. This photoprotein fraction (50 µl) was
then placed on a G-75-40 superfine column (30.7 ml bed volume) equilibrated with 10 mM EDTA, 15 mM
Tris, pH 7.5 and 100 mM KCl. The elution positions of various molecular weight markers are indicated.

The present inventor has obtained for the first time a recombinant DNA vector capable of expressing
the protein apoaequorin in a microorganism and has additionally identified for the first time the amino acid
sequence of apoaequorin, thereby providing access to homogeneous apoaequorin. Using this information, a
variety of recombinant DNA vectors capable of providing homogeneous apoaequorin in reasonable quantities
are obtained. Additional recombinant DNA vectors can be produced using standard techniques of
recombinant DNA technology. A transformant expressing apoaequorin has also been produced as an example of
this technology.

The amino acid sequence of apoaequorin is shown in TABLE 3.

Since there is a known and definite correspondence between amino acids in a peptide and the DNA
sequence that codes for the peptide, the nucleotide sequence of a DNA or RNA molecule coding for apoaequorin
(or any of the modified peptides later discussed) can readily be derived from this amino acid sequence, and
such a sequence of nucleotides is shown in TABLE 4.
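The derivation of a coding sequence from an amino acid sequence amounts to a table lookup, one degenerate codon per residue. The following sketch shows the idea in Python; the names DEGENERATE_CODON and back_translate are hypothetical, and the codon symbols are those of TABLE 1.

```python
# Degenerate DNA codon (TABLE 1 notation) for each amino acid,
# keyed by one-letter amino acid symbol.
DEGENERATE_CODON = {
    "F": "TTK", "L": "XTY", "I": "ATM", "M": "ATG", "V": "GTL",
    "S": "QRS", "P": "CCL", "T": "ACL", "A": "GCL", "Y": "TAK",
    "H": "CAK", "Q": "CAJ", "N": "AAK", "K": "AAJ", "D": "GAK",
    "E": "GAJ", "C": "TGK", "W": "TGG", "R": "WGZ", "G": "GGL",
}

def back_translate(peptide):
    """Return the degenerate codon sequence for a one-letter peptide string."""
    return " ".join(DEGENERATE_CODON[residue] for residue in peptide)

# First five residues at the amino terminus: Val-Lys-Leu-Thr-Pro.
print(back_translate("VKLTP"))  # GTL AAJ XTY ACL CCL
```

Applied to a full 189-residue sequence, the same lookup yields a 567-nucleotide degenerate coding sequence; positions that admit alternative amino acids simply carry the corresponding alternative codons.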
TABLE 4

Nucleotide sequence of one strand of apoaequorin DNA. The numbers refer
to the amino acid sequence and corresponding DNA codon sequence beginning
at the amino terminus of the protein. The DNA sequence corresponds to the
mRNA sequence except that U replaces T in the mRNA.

Position   Amino acid             Codon
  1        Val                    GTL
  2        Lys                    AAJ
  3        Leu                    XTY
  4        Thr                    ACL
  5        [Pro or Ser]           [CCL or QRS]
  6        Asp                    GAK
  7        Phe                    TTK
  8        [Asn or Asp]           [AAK or GAK]
  9        Asn                    AAK
 10        Pro                    CCL
 11        [Lys or Arg]           [AAJ or WGZ]
 12        Try                    TGG
 13        Ile                    ATM
 14        Gly                    GGL
 15        Arg                    WGZ
 16        His                    CAK
 17        Lys                    AAJ
 18        His                    CAK
 19        Met                    ATG
 20        Phe                    TTK
 21        Asn                    AAK
 22        Phe                    TTK
 23        Leu                    XTY
 24        Asp                    GAK
 25        Val                    GTL
 26        Asn                    AAK
 27        His                    CAK
 28        Asn                    AAK
 29        Gly                    GGL
 30        Lys                    AAJ
 31        Ile                    ATM
 32        Ser                    QRS
 33        Leu                    XTY
 34        Asp                    GAK
 35        Glu                    GAJ
 36        Met                    ATG
 37        Val                    GTL
 38        Tyr                    TAK
 39        Lys                    AAJ
 40        Ala                    GCL
 41        Ser                    QRS
 42        Asp                    GAK
 43        Ile                    ATM
 44        Val                    GTL
 45        Ile                    ATM
 46        Asn                    AAK
 47        Asn                    AAK
 48        Leu                    XTY
 49        Gly                    GGL
 50        Ala                    GCL
 51        Thr                    ACL
 52        Pro                    CCL
 53        Glu                    GAJ
 54        Gln                    CAJ
 55        Ala                    GCL
 56        Lys                    AAJ
 57        Arg                    WGZ
 58        His                    CAK
 59        Lys                    AAJ
 60        Asp                    GAK
 61        Ala                    GCL
 62        Val                    GTL
 63        Glu                    GAJ
 64        Ala                    GCL
 65        Phe                    TTK
 66        Phe                    TTK
 67        Gly                    GGL
 68        Gly                    GGL
 69        Ala                    GCL
 70        Gly                    GGL
 71        Met                    ATG
 72        Lys                    AAJ
 73        Tyr                    TAK
 74        Gly                    GGL
 75        Val                    GTL
 76        Glu                    GAJ
 77        Thr                    ACL
 78        [Asp or Glu]           [GAK or GAJ]
 79        Try                    TGG
 80        Pro                    CCL
 81        [Ala or Glu]           [GCL or GAJ]
 82        Tyr                    TAK
 83        Ile                    ATM
 84        Glu                    GAJ
 85        Gly                    GGL
 86        Try                    TGG
 87        Lys                    AAJ
 88        [Lys or Arg]           [AAJ or WGZ]
 89        Leu                    XTY
 90        Ala                    GCL
 91        Thr                    ACL
 92        [Asp, Glu, or Cys]     [GAK, GAJ, or TGK]
 93        Glu                    GAJ
 94        Leu                    XTY
 95        Glu                    GAJ
 96        [Lys or Arg]           [AAJ or WGZ]
 97        Tyr                    TAK
 98        [Ala or Ser]           [GCL or QRS]
 99        Lys                    AAJ
100        Asn                    AAK
101        [Gln or Glu]           [CAJ or GAJ]
102        [Ile or Pro]           [ATM or CCL]
103        Thr                    ACL
104        Leu                    XTY
105        Ile                    ATM
106        Arg                    WGZ
107        [Ile or Leu]           [ATM or XTY]
108        Try                    TGG
109        Gly                    GGL
110        Asp                    GAK
111        Ala                    GCL
112        Leu                    XTY
113        Phe                    TTK
114        Asp                    GAK
115        Ile                    ATM
116        [Ile or Val]           [ATM or GTL]
117        Asp                    GAK
118        Lys                    AAJ
119        Asp                    GAK
120        Gln                    CAJ
121        Asn                    AAK
122        Gly                    GGL
123        Ala                    GCL
124        Ile                    ATM
125        Thr                    ACL
126        Leu                    XTY
127        [Ser or Asp]           [QRS or GAK]
128        Glu                    GAJ
129        Try                    TGG
130        Lys                    AAJ
131        Ala                    GCL
132        Tyr                    TAK
133        Thr                    ACL
134        Lys                    AAJ
135        [Ser or Ala]           [QRS or GCL]
136        Ala                    GCL
137        Gly                    GGL
138        Ile                    ATM
139        Ile                    ATM
140        Gln                    CAJ
141        [Thr or Ser]           [ACL or QRS]
142        Ser                    QRS
143        Glu                    GAJ
144        [Glu or Asp]           [GAJ or GAK]
145        Cys                    TGK
146        Glu                    GAJ
147        Glu                    GAJ
148        Thr                    ACL
149        Phe                    TTK
150        Arg                    WGZ
151        Val                    GTL
152        Cys                    TGK
153        Asp                    GAK
154        Ile                    ATM
155        Asp                    GAK
156        Glu                    GAJ
157        Ser                    QRS
158        Gly                    GGL
159        Gln                    CAJ
160        Leu                    XTY
161        Asp                    GAK
162        Val                    GTL
163        Asp                    GAK
164        Glu                    GAJ
165        Met                    ATG
166        Thr                    ACL
167        Arg                    WGZ
168        Gln                    CAJ
169        His                    CAK
170        Leu                    XTY
171        Gly                    GGL
172        Phe                    TTK
173        Try                    TGG
174        Tyr                    TAK
175        Thr                    ACL
176        Met                    ATG
177        Asp                    GAK
178        Pro                    CCL
179        Ala                    GCL
180        Cys                    TGK
181        Glu                    GAJ
182        Lys                    AAJ
183        Leu                    XTY
184        Tyr                    TAK
185        Gly                    GGL
186        Gly                    GGL
187        Ala                    GCL
188        Val                    GTL
189        Pro-COOH               CCL



Since the DNA sequence of the gene has been fully identified, it is possible to produce a DNA gene 
entirely by synthetic chemistry, after which the gene can be inserted into any of the many available DNA 
vectors using known techniques of recombinant DNA technology. Thus the present invention can be carried 
out using reagents, plasmids, and microorganisms which are freely available and in the public domain at the 
time of filing of this patent application. 

For example, nucleotide sequences greater than 100 bases long can be readily synthesized on an 
Applied Biosystems Model 380A DNA Synthesizer as evidenced by commercial advertising of the same 
(e.g., Genetic Engineering News, November/December 1994, p. 3). Such oligonucleotides can readily be 
spliced using, among others, the techniques described later in this application to produce any nucleotide 
sequence described herein. 

Furthermore, automated equipment is also available that makes direct synthesis of any of the peptides 
disclosed herein readily available. In the same issue of Genetic Engineering News mentioned above, a 
commercially available automated peptide synthesizer having a coupling efficiency exceeding 99% is 
advertised (page 34). Such equipment provides ready access to the peptides of the invention, either by 
direct synthesis or by synthesis of a series of fragments that can be coupled using other known techniques. 

In addition to the specific peptide sequences shown in Table 3, other peptides based on these 
sequences and representing minor variations thereof will have the biological activity of apoaequorin. For 
example, up to 15 amino acids can be absent from either or both terminals of the sequence given without 
losing luciferin and calcium binding ability. Likewise, up to 10 additional amino acids can be present at 
either or both terminals. These variations are possible because the luciferin and calcium binding sites 
involve the amino acids in the middle of the given sequences. For example, the luciferin binding site 
appears to involve amino acids 40-100. Since the terminals are relatively unimportant for biological activity, 
the identity of added amino acids is likewise unimportant and can be any of the amino acids mentioned 
herein. 

Experimental data are available to verify that added amino acids at the amino terminal do not have a 
significant effect on bioluminescence. Nevertheless, preferred compounds are those which more closely 
approach the specific formulas given, with 10 or fewer, more preferably 5 or fewer, absent amino acids 
being preferred for either terminal and 7 or fewer, more preferably 4 or fewer, additional amino acids being 
preferred for either terminal. 

Within the central portion of the molecule, replacement of amino acids is more restricted in order that 
biological activity can be maintained. However, all of the points of microheterogeneity shown in Table 3 or 
Table 4 represent biologically functional replacements and any combination of the indicated replacements 
will represent a functional molecule. In both Tables, the main line (Table 3) or first entry (Table 4) 
represents the more prevalent amino acid or nucleotide for that location and is preferred. 

In addition, minor variations of the previously mentioned peptides and DNA molecules are also 
contemplated as being equivalent to those peptides and DNA molecules that are set forth in more detail, as 
will be appreciated by those skilled in the art. For example, it is reasonable to expect that an isolated 
replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a 
serine, or a similar replacement of an amino acid with a structurally related amino acid will not have a major 
effect on the biological activity of the resulting molecule, especially if the replacement does not involve an 
amino acid at a binding site. Whether a change results in a functioning peptide can readily be determined 
by incubating the resulting peptide with a luciferin followed by contact with calcium ions. Examples of this 
process are described later in detail. If light is emitted, the replacement is immaterial, and the molecule 
being tested is equivalent to those of Table 3. Peptides in which more than one replacement has taken 
place can readily be tested in the same manner. 

DNA molecules that code for such peptides can readily be determined from the list of codons in Table 
1 and are likewise contemplated as being equivalent to the DNA sequence of Table 4. In fact, since there is 
a 1:1 correspondence between DNA codons and amino acids in a peptide, any discussion in this application 
of a replacement or other change in a peptide is equally applicable to the corresponding DNA sequence or 
to the DNA molecule, recombinant vector, or transformed microorganism in which the sequence is located 
(and vice versa). 
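
The relationship between a coding sequence and its peptide can be illustrated with a short sketch. It uses the standard genetic code with unambiguous bases rather than the ambiguity symbols of Table 1. The function names and the example sequence are hypothetical; the example encodes only the first four residues listed above (Val-Lys-Leu-Thr), with one arbitrarily chosen codon for each.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding-strand DNA sequence, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def codons_for(aa):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


# Hypothetical coding sequence for Val-Lys-Leu-Thr (one codon choice per residue).
print(translate("GTTAAACTGACC"))
# All codons that specify lysine.
print(codons_for("K"))
```

Each codon specifies exactly one amino acid, so a change described at the peptide level always has a counterpart at the DNA level; several codons may specify the same amino acid, which is why the table lists ambiguity symbols rather than single bases.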

In addition to the specific nucleotides listed in Table 4, DNA (or corresponding RNA) molecules of the 
invention can have additional nucleotides preceding or following those that are specifically listed. For 
example, poly A can be added to the 3'-terminal, a short (e.g., fewer than 20 nucleotides) sequence can be 
added to either terminal to provide a terminal sequence corresponding to a restriction endonuclease site, 
stop codons can follow the peptide sequence to terminate translation, and the like. Additionally, DNA 
molecules containing a promoter region or other control region upstream from the gene can be produced. 
All DNA molecules containing the sequences of the invention will be useful for at least one purpose since all 
can minimally be fragmented to produce oligonucleotide probes of the type described later in this 
specification. 

Peptides of the invention can be prepared for the first time as homogeneous preparations, either by 
direct synthesis or by using a cloned gene as described herein. By "homogeneous" is meant, when 
referring to a peptide or DNA sequence, that the primary molecular structure (i.e., the sequence of amino 
acids or nucleotides) of substantially all molecules present in the composition under consideration is 
identical. The term "substantially" as used in the preceding sentence preferably means at least 95% by 
weight, more preferably at least 99% by weight, and most preferably at least 99.8% by weight. The 
presence of fragments derived from entire molecules of the homogeneous peptide or DNA sequence, if 
present in no more than 5% by weight, preferably 1% by weight, and more preferably 0.2% by weight, is 
not to be considered in determining homogeneity since the term "homogeneous" relates to the presence of 
entire molecules (and fragments thereof) having a single defined structure as opposed to mixtures (such as 
those that occur in natural apoaequorin) in which several molecules of similar molecular weight are present 
but which differ in their primary molecular structure. The term "isolated" as used herein refers to pure 
peptide, DNA, or RNA separated from other peptides, DNAs, or RNAs, respectively, and being found in the 
presence of (if anything) only a solvent, buffer, ion or other component normally present in a biochemical 
solution of the same. "Isolated" does not encompass either natural materials in their native state or natural 
materials that have been separated into components (e.g., in an acrylamide gel) but not obtained either as 
pure substances or as solutions. The term "pure" as used herein preferably has the same numerical limits 
as "substantially" immediately above. The phrase "replaced by" or "replacement" as used herein does not 
necessarily refer to any action that must take place but to the peptide that exists when an indicated 
"replacement" amino acid is present in the same position as the amino acid indicated to be present in a 
different formula (e.g., when serine is present at position 5 instead of proline). 
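
The numerical limits attached to "substantially" can be expressed as a small classification helper. This is only a sketch of the three tiers stated above; the function name and the returned labels are hypothetical.

```python
def homogeneity_grade(identical_weight_fraction):
    """Map the weight fraction of identical molecules to the tiers named in the text.

    The thresholds (95%, 99%, 99.8% by weight) come from the definition of
    "substantially"; the label strings are illustrative only.
    """
    if not 0.0 <= identical_weight_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if identical_weight_fraction >= 0.998:
        return "most preferred"
    if identical_weight_fraction >= 0.99:
        return "more preferred"
    if identical_weight_fraction >= 0.95:
        return "preferred"
    return "not homogeneous"


# Hypothetical preparation in which 99.3% of the material by weight is one species.
print(homogeneity_grade(0.993))
```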

Salts of any of the peptides described herein will naturally occur when such peptides are present in (or 
isolated from) aqueous solutions of various pHs. All salts of peptides having the indicated biological activity 
are considered to be within the scope of the present invention. Examples include alkali, alkaline earth, and 
other metal salts of carboxylic acid residues, acid addition salts (e.g., HCl) of amino residues, and 
zwitterions formed by reactions between carboxylic acid and amino residues within the same molecule. 

The invention has specifically contemplated each and every possible variation of peptide or nucleotide 
that could be made by selecting combinations based on the possible amino acid and codon choices listed 
in Table 3 and Table 4, and all such variations are to be considered as being specifically disclosed. 
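
To give a sense of how many peptide variations this covers, the sketch below multiplies the number of amino acid choices at each variable position. The positions and choices are a transcription of the bracketed entries in the sequence table above and should be treated as an assumption to be checked against that table; the names used are hypothetical. Nucleotide-level variations, which also depend on the codon ambiguity symbols, are not counted here.

```python
from math import prod

# Variable positions and their alternative residues, transcribed (as an
# assumption to be verified) from the bracketed entries of the sequence table.
VARIABLE_POSITIONS = {
    5: ("Pro", "Ser"),
    8: ("Asn", "Asp"),
    11: ("Lys", "Arg"),
    78: ("Asp", "Glu"),
    81: ("Ala", "Glu"),
    88: ("Lys", "Arg"),
    92: ("Asp", "Glu", "Cys"),
    96: ("Lys", "Arg"),
    98: ("Ala", "Ser"),
    101: ("Gln", "Glu"),
    102: ("Ile", "Pro"),
    107: ("Ile", "Leu"),
    116: ("Ile", "Val"),
    127: ("Ser", "Asp"),
    135: ("Ser", "Ala"),
    141: ("Thr", "Ser"),
    144: ("Glu", "Asp"),
}


def count_peptide_variants(positions):
    """Count distinct peptides obtainable by choosing one residue per variable position."""
    return prod(len(choices) for choices in positions.values())


print(count_peptide_variants(VARIABLE_POSITIONS))
```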

In a preferred embodiment of the invention, genetic information encoded as mRNA is obtained from 
Aequorea jellyfish and used in the construction of a DNA gene, which is in turn used to produce a peptide 
of the invention. 

It is preferred to use a cell extract from the light emitting organs of an Aequorea jellyfish as a source of 
mRNA, although a whole body cell extract may be used. Typically, a jellyfish or parts thereof is cut into 
small pieces (minced) and the pieces are ground to provide an initial crude cell suspension. The cell 
suspension is sonicated or otherwise treated to disrupt cell membranes so that a crude cell extract is 
obtained. Known techniques of biochemistry (e.g., preferential precipitation of proteins) can be used for 
initial purification if desired. The crude cell extract, or a partially purified RNA portion therefrom, is then 
treated to further separate the RNA. For example, crude cell extract can be layered on top of a 5 ml 
cushion of 5.7 M CsCl, 10 mM Tris-HCl, pH 7.5, 1 mM EDTA in a 1 in. x 3 1/2 in. nitrocellulose tube and 
centrifuged in an SW27 rotor (Beckman Instruments Corp., Fullerton, Calif.) at 27,000 rpm for 16 hr at 15 °C. 
After centrifugation, the tube contents are decanted, the tube is drained, and the bottom 1/2 cm containing 
the clear RNA pellet is cut off with a razor blade. The pellets are transferred to a flask and dissolved in 20 
ml 10 mM Tris-HCl, pH 7.5, 1 mM EDTA, 5% sarcosyl and 5% phenol. The solution is then made 0.1 M in 
NaCl and shaken with 40 ml of a 1:1 phenol:chloroform mixture. RNA is precipitated from the aqueous 
phase with ethanol in the presence of 0.2 M Na-acetate pH 5.5 and collected by centrifugation. Any other 
method of isolating RNA from a cellular source may be used instead of this method. 

Various forms of RNA may be employed such as polyadenylated, crude or partially purified messenger 
RNA, which may be heterogeneous in sequence and in molecular size. The selectivity of the RNA isolation 
procedure is enhanced by any method which results in an enrichment of the desired mRNA in the 
heterodisperse population of mRNA isolated. Any such prepurification method may be employed in 
preparing a gene of the present invention, provided that the method does not introduce endonucleolytic 
cleavage of the mRNA. 

Prepurification to enrich for desired mRNA sequences may also be carried out using conventional 
methods for fractionating RNA, after its isolation from the cell. Any technique which does not result in 
degradation of the RNA may be employed. The techniques of preparative sedimentation in a sucrose 
gradient and gel electrophoresis are especially suitable. 

The mRNA must be isolated from the source cells under conditions which preclude degradation of the 
mRNA. The action of RNase enzymes is particularly to be avoided because these enzymes are capable of 
hydrolytic cleavage of the RNA nucleotide sequence. A suitable method for inhibiting RNase during 
extraction from cells involves the use of 4 M guanidinium thiocyanate and 1 M mercaptoethanol during the 
cell disruption step. In addition, a low temperature and a pH near 5.0 are helpful in further reducing RNase 
degradation of the isolated RNA. 

Generally, mRNA is prepared essentially free of contaminating protein, DNA, polysaccharides and 
lipids. Standard methods are well known in the art for accomplishing such purification. RNA thus isolated 
contains non-messenger as well as messenger RNA. A convenient method for separating the mRNA of 
eucaryotes is chromatography on columns of oligo-dT cellulose, or other oligonucleotide-substituted 
column material such as poly U-Sepharose, taking advantage of the hydrogen bonding specificity 
conferred by the presence of polyadenylic acid on the 3' end of eucaryotic mRNA. 

The next step in most methods is the formation of DNA complementary to the isolated heterogeneous 
sequences of mRNA. The enzyme of choice for this reaction is reverse transcriptase, although in principle 
any enzyme capable of forming a faithful complementary DNA copy of the mRNA template could be used. 
The reaction may be carried out under conditions described in the prior art, using mRNA as a template and 
a mixture of the four deoxynucleoside triphosphates, dATP, dGTP, dCTP and dTTP, as precursors for the 
DNA strand. It is convenient to provide that one of the deoxynucleoside triphosphates be labeled with a 
radioisotope, for example 32P in the alpha position, in order to monitor the course of the reaction, to provide 
a tag for recovering the product after separation procedures such as chromatography and electrophoresis, 
and for the purpose of making quantitative estimates of recovery. See Efstratiadis, A., et al., supra. 

The cDNA transcripts produced by the reverse transcriptase reaction are somewhat heterogeneous with 
respect to sequences at the 5' end and the 3' end due to variations in the initiation and termination points of 
individual transcripts, relative to the mRNA template. The variability at the 5' end is thought to be due to the 
fact that the oligo-dT primer used to initiate synthesis is capable of binding at a variety of loci along the 
polyadenylated region of the mRNA. Synthesis of the cDNA transcript begins at an indeterminate point in 
the poly-A region, and a variable length of poly-A region is transcribed depending on the initial binding 
site of the oligo-dT primer. It is possible to avoid this indeterminacy by the use of a primer containing, in 
addition to an oligo-dT tract, one or two nucleotides of the RNA sequence itself, thereby producing a 
primer which will have a preferred and defined binding site for initiating the transcription reaction. 
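
The effect of such an anchored primer can be illustrated with a toy model that counts perfect-match annealing positions on an mRNA. The mRNA sequence, the primer lengths, and the function names are all hypothetical; only exact base pairing is modelled, with no account of temperature or salt.

```python
def complement_rna_site(primer_dna):
    """Return the RNA sequence (5'->3') to which a DNA primer (5'->3') anneals."""
    pairs = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(primer_dna))


def binding_sites(mrna, primer_dna):
    """List every 0-based start index in the mRNA where the primer pairs perfectly."""
    site = complement_rna_site(primer_dna)
    return [
        i
        for i in range(len(mrna) - len(site) + 1)
        if mrna[i:i + len(site)] == site
    ]


# Hypothetical mRNA: a short 3' region followed by a 20-nucleotide poly-A tail.
mrna = "GGCUACGUCC" + "A" * 20

# Oligo-dT alone can anneal at many registers along the poly-A tail.
plain = binding_sites(mrna, "T" * 12)
# Oligo-dT extended at its 3' end by two bases complementary to the RNA
# immediately upstream of the tail anneals at a single defined position.
anchored = binding_sites(mrna, "T" * 12 + "GG")

print(len(plain), "possible sites for oligo-dT alone")
print(len(anchored), "possible site(s) for the anchored primer")
```

In this model the added bases sit at the 3' end of the primer, the end that reverse transcriptase extends, so they pair with the RNA bases just before the poly-A tail.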

The indeterminacy at the 3'-end of the cDNA transcript is due to a variety of factors affecting the 
reverse transcriptase reaction, and to the possibility of partial degradation of the RNA template. The 
isolation of specific cDNA transcripts of maximal length is greatly facilitated if conditions for the reverse 
transcriptase reaction are chosen which not only favor full length synthesis but also repress the synthesis of 
small DNA chains. Preferred reaction conditions for avian myeloblastosis virus reverse transcriptase are 
given in the examples section of U.S. Patent 4,363,877 and are herein incorporated by reference. The 
specific parameters which may be varied to provide maximal production of long-chain DNA transcripts of 
high fidelity are reaction temperature, salt concentration, amount of enzyme, concentration of primer relative 
to template, and reaction time. 

The conditions of temperature and salt concentration are chosen so as to optimize specific base-pairing 
between the oligo-dT primer and the polyadenylated portion of the RNA template. Under properly 
chosen conditions, the primer will be able to bind at the polyadenylated region of the RNA template, but 
non-specific initiation due to primer binding at other locations on the template, such as short, A-rich 
sequences, will be substantially prevented. The effects of temperature and salt are interdependent. Higher 
temperatures and low salt concentrations decrease the stability of specific base-pairing interactions. The 
reaction time is kept as short as possible, in order to prevent non-specific initiations and to minimize the 
opportunity for degradation. Reaction times are interrelated with temperature, lower temperatures requiring 
longer reaction times. At 42 °C, reactions ranging from 1 minute to 10 minutes are suitable. The primer should 
be present in 50- to 500-fold molar excess over the RNA template and the enzyme should be present in 
similar molar excess over the RNA template. The use of excess enzyme and primer enhances initiation and 
cDNA chain growth so that long-chain cDNA transcripts are produced efficiently within the confines of the 
short incubation times. 

In many cases, it will be possible to further purify the cDNA using single-stranded cDNA sequences 
transcribed from mRNA. However, as discussed below, there may be instances in which the desired 
restriction enzyme is one which acts only on double-stranded DNA. In these cases, the cDNA prepared as 
described above may be used as a template for the synthesis of double-stranded DNA, using a DNA 
polymerase such as reverse transcriptase and a nuclease capable of hydrolyzing single-stranded DNA. 
Methods for preparing double-stranded DNA in this manner have been described in the prior art. See, for 
example, Ullrich, A., Shine, J., Chirgwin, J., Pictet, R., Tischer, E., Rutter, W.J. and Goodman, H.M., Science 
196, 1313 (1977). If desired, the cDNA can be purified further by the process of U.S. Patent 4,363,877, 
although this is not essential. In this method, heterogeneous cDNA, prepared by transcription of heterogeneous 
mRNA sequences, is treated with one or two restriction endonucleases. The choice of endonuclease 
to be used depends in the first instance upon a prior determination that recognition sites for the enzyme 
exist in the sequence of the cDNA to be isolated. The method depends upon the existence of two such 
sites. If the sites are identical, a single enzyme will be sufficient. The desired sequence will be cleaved at 
both sites, eliminating size heterogeneity as far as the desired cDNA sequence is concerned, and creating a 
population of molecules, termed fragments, containing the desired sequence and homogeneous in length. If 
the restriction sites are different, two enzymes will be required in order to produce the desired 
homogeneous-length fragments. 
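
The way two flanking sites remove size heterogeneity can be sketched in a few lines. The transcripts below are hypothetical strings of different lengths that share one region flanked by two identical sites; the sketch assumes the HaeIII recognition sequence GGCC, cut between GG and CC, and treats each transcript as a simple string.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site; cut_offset is the cut position within the site."""
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical desired region flanked by two identical recognition sites.
CORE = "GGCC" + "ATGACCGATTTCAAAGTT" + "GGCC"

# Hypothetical heterodisperse transcripts: same core, different flanking lengths.
transcripts = [
    "TTAC" + CORE + "ACGTACGTAA",
    "C" + CORE + "ACG",
    "AATTCGAT" + CORE + "ACGTACGTAACGTTTT",
]

# The fragment between the two cuts is the second piece of each digest.
internal = {digest(t, "GGCC", 2)[1] for t in transcripts}

print(sorted(len(t) for t in transcripts))  # the starting lengths differ
print(internal)                             # a single internal fragment remains
```

Although the input molecules differ in length, every digest yields the same internal fragment, which is what appears as a discrete band on a gel.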

The choice of restriction enzyme(s) capable of producing an optimal length nucleotide sequence 
fragment coding for all or part of the desired protein must be made empirically. If the amino acid sequence 
of the desired protein is known, it is possible to compare the nucleotide sequence of uniform length 
nucleotide fragments produced by restriction endonuclease cleavage with the amino acid sequence for 
which it codes, using the known relationship of the genetic code common to all forms of life. A complete 
amino acid sequence for the desired protein is not necessary, however, since a reasonably accurate 
identification may be made on the basis of a partial sequence. Where the amino acid sequence of the 
desired protein is not known, the uniform length polynucleotides produced by restriction endonuclease 
cleavage may be used as probes capable of identifying the synthesis of the desired protein in an 
appropriate in vitro protein synthesizing system. Alternatively, the mRNA may be purified by affinity 
chromatography. Other techniques which may be suggested to those skilled in the art will be appropriate for 
this purpose. 

The number of restriction enzymes suitable for use depends upon whether single-stranded or 
double-stranded cDNA is used. The preferred enzymes are those capable of acting on single-stranded 
DNA, which is the immediate reaction product of mRNA reverse transcription. The number of restriction 
enzymes now known to be capable of acting on single-stranded DNA is limited. The enzymes HaeIII, HhaI 
and Hin(f)I are presently known to be suitable. In addition, the enzyme MboII may act on single-stranded 
DNA. Where further study reveals that other restriction enzymes can act on single-stranded DNA, such 
other enzymes may appropriately be included in the list of preferred enzymes. Additional suitable enzymes 
include those specified for double-stranded cDNA. Such enzymes are not preferred since additional 
reactions are required in order to produce double-stranded cDNA, providing increased opportunities for the 
loss of longer sequences and for other losses due to incomplete recovery. The use of double-stranded 
cDNA presents the additional technical disadvantage that subsequent sequence analysis is more complex 
and laborious. For these reasons, single-stranded cDNA is preferred, but the use of double-stranded DNA 
is feasible. In fact, the present invention was initially reduced to practice using double-stranded cDNA. 

The cDNA prepared for restriction endonuclease treatment may be radioactively labeled so that it may 
be detected after subsequent separation steps. A preferred technique is to incorporate a radioactive label 
such as 32P in the alpha position of one of the four deoxynucleoside triphosphate precursors. Highest 
activity is obtained when the concentration of radioactive precursor is high relative to the concentration of 
the non-radioactive form. However, the total concentration of any deoxynucleoside triphosphate should be 
greater than 30 µM, in order to maximize the length of cDNA obtained in the reverse transcriptase reaction. 
See Efstratiadis, A., Maniatis, T., Kafatos, F.C., Jeffrey, A., and Vournakis, J.N., Cell 4, 367 (1975). For the 
purpose of determining the nucleotide sequence of cDNA, the 5' ends may be conveniently labeled with 
32P in a reaction catalyzed by the enzyme polynucleotide kinase. See Maxam, A.M. and Gilbert, W., Proc. Natl. 
Acad. Sci., USA 74, 560 (1977). 

Fragments which have been produced by the action of a restriction enzyme or combination of two 
restriction enzymes may be separated from each other and from heterodisperse sequences lacking 
recognition sites by any appropriate technique capable of separating polynucleotides on the basis of 
differences in length. Such methods include a variety of electrophoretic techniques and sedimentation 
techniques using an ultracentrifuge. Gel electrophoresis is preferred because it provides the best resolution 
on the basis of polynucleotide length. In addition, the method readily permits quantitative recovery of 
separated materials. Convenient gel electrophoresis methods have been described by Dingman, C.W., and 
Peacock, A.C., Biochemistry, 7, 659 (1968), and by Maniatis, T., Jeffrey, A. and van de Sande, H., 
Biochemistry, 14, 3787 (1975). 

Prior to restriction endonuclease treatment, cDNA transcripts obtained from most sources will be found 
to be heterodisperse in length. By the action of a properly chosen restriction endonuclease, or pair of 
endonucleases, polynucleotide chains containing the desired sequence will be cleaved at the respective 
restriction sites to yield polynucleotide fragments of uniform length. Upon gel electrophoresis, these will be 
observed to form a distinct band. Depending on the presence or absence of restriction sites on other 
sequences, other discrete bands may be formed as well, which will most likely be of different length than 
that of the desired sequence. Therefore, as a consequence of restriction endonuclease action, the gel 
electrophoresis pattern will reveal the appearance of one or more discrete bands, while the remainder of the 
cDNA will continue to be heterodisperse. In the case where the desired cDNA sequence comprises the 
major polynucleotide species present, the electrophoresis pattern will reveal that most of the cDNA is 
present in the discrete band. 

Although it is unlikely that two different sequences will be cleaved by restriction enzymes to yield 
fragments of essentially similar length, a method for determining the purity of the defined length fragments 
is desirable. Sequence analysis of the electrophoresis band may be used to detect impurities representing 
10% or more of the material in the band. A method for detecting lower levels of impurities has been 
developed founded upon the same general principles applied in the initial isolation method. The method 
requires that the desired nucleotide sequence fragment contain a recognition site for a restriction 
endonuclease not employed in the initial isolation. Treatment of polynucleotide material, eluted from a gel 
electrophoresis band, with a restriction endonuclease capable of acting internally upon the desired 
sequence will result in cleavage of the desired sequence into two sub-fragments, most probably of 
unequal length. These sub-fragments upon electrophoresis will form two discrete bands at positions 
corresponding to their respective lengths, the sum of which will equal the length of the polynucleotide prior 
to cleavage. Contaminants in the original band that are not susceptible to the restriction enzyme may be 
expected to migrate to the original position. Contaminants containing one or more recognition sites for the 
enzyme may be expected to yield two or more sub-fragments. Since the distribution of recognition sites is 
believed to be essentially random, the probability that a contaminant will also yield sub-fragments of the 
same size as those of the fragment of desired sequence is extremely low. The amount of material present 
in any band of radioactively labeled polynucleotide can be determined by quantitative measurement of the 
amount of radioactivity present in each band, or by any other appropriate method. A quantitative measure of 
the purity of the fragments of desired sequence can be obtained by comparing the relative amounts of 
material present in those bands representing sub-fragments of the desired sequence with the total 
amount of material. 
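
The purity comparison described above amounts to a simple ratio, sketched below. The band counts are hypothetical, and the calculation assumes that the radioactivity measured in each band is proportional to the amount of material in it (for example, with uniformly labeled cDNA).

```python
def purity(subfragment_counts, all_band_counts):
    """Fraction of total radioactivity found in the bands of the desired sub-fragments."""
    total = sum(all_band_counts)
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    return sum(subfragment_counts) / total


# Hypothetical counts per minute after the internal cleavage:
# two sub-fragment bands, one uncut contaminant band at the original
# position, and one band of contaminant sub-fragments.
desired = [6200, 3100]
everything = desired + [500, 200]

print(f"estimated purity: {purity(desired, everything):.1%}")
```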

Following the foregoing separation or any other technique that isolates the desired gene, the sequence 
may be reconstituted. The enzyme DNA ligase, which catalyzes the end-to-end joining of DNA 
fragments, may be employed for this purpose. The gel electrophoresis bands representing the 
sub-fragments of the desired sequence may be separately eluted and combined in the presence of DNA ligase, 
under the appropriate conditions. See Sgaramella, V., Van de Sande, J.H., and Khorana, H.G., Proc. Natl. 
Acad. Sci., USA 67, 1468 (1970). Where the sequences to be joined are not blunt-ended, the ligase 
obtained from E. coli may be used; see Modrich, P., and Lehman, I.R., J. Biol. Chem., 245, 3626 (1970). 

The efficiency of reconstituting the original sequence from sub-fragments produced by restriction 
endonuclease treatment will be greatly enhanced by the use of a method for preventing reconstitution in 
improper sequence. This unwanted result is prevented by treatment of the homogeneous length cDNA 
fragment of desired sequence with an agent capable of removing the 5'-terminal phosphate groups on the 
cDNA prior to cleavage of the homogeneous cDNA with a restriction endonuclease. The enzyme alkaline 
phosphatase is preferred. The 5'-terminal phosphate groups are a structural prerequisite for the subsequent 
joining action of DNA ligase used for reconstituting the cleaved sub-fragments. Therefore, ends 
which lack a 5'-terminal phosphate cannot be covalently joined. The DNA sub-fragments can only be 
joined at the ends containing a 5'-phosphate generated by the restriction endonuclease cleavage 
performed on the isolated DNA fragment. 
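
The joining rule stated above can be captured in a small model. This is a deliberately simplified sketch: each end is reduced to a label and a flag recording whether it carries a 5'-phosphate, an end pair is treated as joinable only when both ends carry one, and joining between separate copies of the same sub-fragment is not modelled. All names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class End:
    label: str
    five_prime_phosphate: bool


def can_ligate(a, b):
    """Simplified rule: ligase joins two ends only if both carry a 5'-phosphate."""
    return a.five_prime_phosphate and b.five_prime_phosphate


# State of the ends after phosphatase treatment of the isolated fragment,
# followed by internal cleavage into sub-fragments A and B.
ends = [
    End("A outer (original end, dephosphorylated)", False),
    End("A inner (new cut)", True),
    End("B inner (new cut)", True),
    End("B outer (original end, dephosphorylated)", False),
]

joinable = [
    (x.label, y.label) for x, y in combinations(ends, 2) if can_ligate(x, y)
]
print(joinable)
```

Under this rule the only permitted junction is the one that restores the original order of the two sub-fragments.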

The majority of cDNA transcripts, under the conditions described above, are derived from the mRNA 
region containing the 5'-end of the mRNA template by specifically priming on the same template with a 
fragment obtained by restriction endonuclease cleavage. In this way, the above-described method may be 
used to obtain not only fragments of specific nucleotide sequence related to a desired protein, but also the 
entire nucleotide sequence coding for the protein of interest. Double-stranded, chemically synthesized 
oligonucleotide linkers, containing the recognition sequence for a restriction endonuclease, may be attached 
to the ends of the isolated cDNA, to facilitate subsequent enzymatic removal of the gene portion from the 
vector DNA. See Scheller, R.H., et al., Science 196, 177 (1977). The vector DNA is converted from a 
continuous loop to a linear form by treatment with an appropriate restriction endonuclease. The ends 
thereby formed are treated with alkaline phosphatase to remove 5'-phosphate end groups so that the 
vector DNA may not reform a continuous loop in a DNA ligase reaction without first incorporating a segment 
of the apoaequorin DNA. The cDNA, with attached linker oligonucleotides, and the treated vector DNA are 
mixed together with a DNA ligase enzyme, to join the cDNA to the vector DNA, forming a continuous loop 
of recombinant vector DNA, having the cDNA incorporated therein. Where a plasmid vector is used, usually 
the closed loop will be the only form able to transform a bacterium. Transformation, as is understood in the 
art and used herein, is the term used to denote the process whereby a microorganism incorporates 
extracellular DNA into its own genetic constitution. Plasmid DNA in the form of a closed loop may be so 
incorporated under appropriate environmental conditions. The incorporated closed loop plasmid undergoes 
replication in the transformed cell, and the replicated copies are distributed to progeny cells when cell 
division occurs. As a result, a new cell line is established, containing the plasmid and carrying the genetic 
determinants thereof. Transformation by a plasmid in this manner, where the plasmid genes are maintained 
in the cell line by plasmid replication, occurs at high frequency when the transforming plasmid DNA is in 
closed loop form, and occurs rarely or not at all if linear plasmid DNA is used. Once a recombinant vector 
has been made, transformation of a suitable microorganism is a straightforward process, and novel 
microorganism strains containing the apoaequorin gene may readily be isolated, using appropriate selection 
techniques, as understood in the art. 

In summary, genetic information can be obtained from Aequorea jellyfish, converted into cDNA, inserted
into a vector, used to transform a host microorganism, and expressed as apoaequorin in the following
manner: 

1. Isolate poly(A+)RNA from Aequorea jellyfish.

2. Synthesize in vitro single-stranded cDNA and then double-stranded cDNA using reverse transcriptase.

3. Digest the single-stranded region with S1 nuclease.

4. Size-fractionate the double-stranded cDNA by gel filtration.

5. Tail the cDNA using terminal transferase and dCTP.

6. Digest pBR322 with Pst I and then tail the linear DNA with terminal transferase and dGTP.

7. Anneal the dC-tailed cDNA fragment and dG-tailed pBR322.

8. Transform E. coli SK1592. Select for tetracycline-resistant colonies.

9. Screen the transformants for ampicillin sensitivity. The tetR ampS colonies contain recombinant
plasmids. Store them at -80°C.

10. Label an oligonucleotide mixed probe (using a sequence deduced from the determined amino acid
sequence) with radioactivity.

11. Grow the members of the Aequorea cDNA bank on nitrocellulose filters. Lyse the colonies and fix the
DNA to the filters.

12. Hybridize the 32P-labelled oligonucleotide mixture to the nitrocellulose filters. The 32P-probe will
hybridize to plasmid DNA from those E. coli recombinants which contain the aequorin cDNA sequence.

13. Wash excess 32P-probe from the filters.

14. Expose X-ray film to the filters.

15. Prepare plasmid DNA from the recombinants identified in the Aequorea cDNA bank.

16. Hybridize the 32P-labelled oligonucleotide to the plasmid DNA (Southern blot) to confirm the
hybridization.

17. Demonstrate that these recombinants contain the aequorin DNA sequence by preparing extracts in
EDTA-containing buffers, pH 7.2. Charge the expressed apoprotein by adding coelenterate luciferin and
β-mercaptoethanol and incubating at 4°C overnight. Samples that express aequorin apoprotein emit a
flash of blue light upon the addition of Ca2+.
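
The selection in steps 8 and 9 rests on insertional inactivation: the Pst I site of pBR322 lies within the ampicillin-resistance gene, so an insert at that site leaves tetracycline resistance intact while abolishing ampicillin resistance. The following is a minimal sketch of that screening logic only; the function name and the colony records are hypothetical and are given purely for illustration.

```python
def classify_colony(tet_resistant: bool, amp_resistant: bool) -> str:
    """Interpret a colony's antibiotic phenotype after transformation with
    dG-tailed, Pst I-cut pBR322 annealed to dC-tailed cDNA."""
    if not tet_resistant:
        return "no plasmid"              # cell did not acquire pBR322
    if amp_resistant:
        return "plasmid without insert"  # ampicillin-resistance gene still intact
    return "recombinant"                 # insert at the Pst I site disrupted amp resistance

# Hypothetical replica-plating results: (colony id, tetR?, ampR?)
colonies = [("c1", True, False), ("c2", True, True), ("c3", False, False)]
recombinants = [cid for cid, tet, amp in colonies
                if classify_colony(tet, amp) == "recombinant"]
print(recombinants)
```

Only colonies of the first kind (tetracycline resistant, ampicillin sensitive) would be carried forward to the hybridization screen of steps 10 through 16.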

Although the sequence of steps set forth above, when used in combination with the knowledge of those 
skilled in the art of genetic engineering and the previously stated guidelines, will readily enable isolation of 
the desired gene and its use in recombinant DNA vectors, other methods which lead to the same result are 
also known and may be used in the preparation of recombinant DNA vectors of this invention. 
Expression of apoaequorin can be enhanced by including multiple copies of the apoaequorin gene in a
transformed host, by selecting a vector known to reproduce in the host, thereby producing large quantities
of protein from exogenous inserted DNA (such as pUC8, ptac12, or pIN-III-ompA1, 2, or 3), or by any
other known means of enhancing peptide expression.

In all cases, apoaequorin will be expressed when the DNA sequence is functionally inserted into the
vector. By "functionally inserted" is meant in proper reading frame and orientation, as is well understood by
those skilled in the art. Typically, an apoaequorin gene will be inserted downstream from a promoter and
will be followed by a stop codon, although production as a hybrid protein followed by cleavage may be
used, if desired.

In addition to the above general procedures which can be used for preparing recombinant DNA
molecules and transformed unicellular organisms in accordance with the practices of this invention, other
known techniques and modifications thereof can be used in carrying out the practice of the invention. In 
particular, techniques relating to genetic engineering have recently undergone explosive growth and 
development. Many recent U.S. patents disclose plasmids, genetically engineered microorganisms, and
methods of conducting genetic engineering which can be used in the practice of the present invention. For
example, U.S. Patent 4,273,875 discloses a plasmid and a process of isolating the same. U.S. Patent
4,304,863 discloses a process for producing bacteria by genetic engineering in which a hybrid plasmid is 
constructed and used to transform a bacterial host. U.S. Patent 4,419,450 discloses a plasmid useful as a 
cloning vehicle in recombinant DNA work. U.S. Patent 4,362,867 discloses recombinant cDNA construction 
methods and hybrid nucleotides produced thereby which are useful in cloning processes. U.S. Patent
4,403,036 discloses genetic reagents for generating plasmids containing multiple copies of DNA segments.
U.S. Patent 4,363,877 discloses recombinant DNA transfer vectors. U.S. Patent 4,356,270 discloses a 
recombinant DNA cloning vehicle and is a particularly useful disclosure for those with limited experience in 
the area of genetic engineering since it defines many of the terms used in genetic engineering and the 
basic processes used therein. U.S. Patent 4,336,336 discloses a fused gene and a method of making the
same. U.S. Patent 4,349,629 discloses plasmid vectors and the production and use thereof. U.S. Patent
4,332,901 discloses a cloning vector useful in recombinant DNA. Although some of these patents are 
directed to the production of a particular gene product that is not within the scope of the present invention, 
the procedures described therein can easily be modified to the practice of the invention described in this 
specification by those skilled in the art of genetic engineering. 

All of these patents as well as all other patents and other publications cited in this disclosure are
indicative of the level of skill of those skilled in the art to which this invention pertains and are all herein 
incorporated by reference. 

The implications of the present invention are significant in that unlimited supplies of apoaequorin will 
become available for use in the development of luminescent immunoassays or in any other type of assay
utilizing aequorin as a marker. Methods of using apoaequorin in a bioluminescent assay are disclosed in
Serial Number 541,405, filed October 13, 1983, and commonly assigned, which is herein incorporated by 
reference. Transferring the apoaequorin cDNA which has been isolated to other expression vectors will 
produce constructs which improve the expression of the apoaequorin polypeptide in E. coli or express
apoaequorin in other hosts. Furthermore, by using the apoaequorin cDNA or a fragment thereof as a
hybridization probe, structurally related genes found in other bioluminescent coelenterates and other
organisms such as squid (Mollusca), fish (Pisces), and Crustacea can be easily cloned. These genes
include those that code for the luciferases of Renilla, Stylatula, Ptilosarcus, Cavernularia, and Acanthoptilum
in addition to those that code for the photoproteins found in the Hydrozoan Obelia and the ctenophores
Mnemiopsis and Beroe.

Particularly contemplated is the isolation of genes from these and related organisms that express 
photoproteins using oligonucleotide probes based on the principal and variant nucleotide sequences
disclosed herein. Such probes can be considerably shorter than the entire sequence but should be at least
10, preferably at least 14, nucleotides in length. Longer oligonucleotides are also useful, up to the full length
of the gene. Both RNA and DNA probes can be used. 
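
The minimum lengths given above can be motivated by a simple estimate. Under the simplifying assumption of a random sequence with equal base frequencies, a given n-mer is expected to occur by chance about once every 4^n positions. The sketch below illustrates that reasoning only; it is not taken from this specification, and the genome size used is hypothetical.

```python
def expected_chance_sites(probe_len: int, genome_bp: float) -> float:
    """Expected number of exact chance matches to one probe, counting both
    strands of a random genome with equal base frequencies (an assumption)."""
    return 2 * genome_bp / 4 ** probe_len

GENOME_BP = 1e9  # hypothetical genome size, for illustration only
for n in (8, 10, 14, 17):
    print(n, expected_chance_sites(n, GENOME_BP))
```

Each additional nucleotide reduces the expected number of chance matches fourfold, which is one reason longer probes are more selective.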

In use, the probes are typically labelled in a detectable manner (e.g., with 32P, 3H, biotin, or avidin) and
are incubated with single-stranded DNA or RNA from the organism in which a gene is being sought.
Hybridization is detected by means of the label after single-stranded and double-stranded (hybridized)
DNA (or DNA/RNA) have been separated (typically using nitrocellulose paper). Hybridization techniques
suitable for use with oligonucleotides are well known. 

Although probes are normally used with a detectable label that allows easy identification, unlabeled 
oligonucleotides are also useful, both as precursors of labeled probes and for use in methods that provide 
for direct detection of double-stranded DNA (or DNA/RNA). Accordingly, the term "oligonucleotide probe"
refers to both labeled and unlabeled forms.

Particularly preferred are oligonucleotides obtained from the region coding for amino acids 40 through 
110 of the peptide sequences described herein, since these are the amino acids involved in binding to 
luciferin. 

Coelenterate luciferin is found in and binds to photoproteins from all the organisms listed in Table 5, 
and it is contemplated that oligonucleotides as described herein will be useful as probes in isolating
photoprotein genes from all these species.

Table 5: Distribution of coelenterate-type luciferin

1. Cnidaria (coelenterates)
   A. Anthozoa
      Renilla (three sp.) (a)
      Stylatula
      Ptilosarcus
      Cavernularia (b)
      Acanthoptilum
   B. Hydrozoa
      Aequorea (b)
      Obelia

2. Ctenophora
      Mnemiopsis
      Beroe

3. Mollusca
      Watasenia (squid) (a)

4. Pisces
      Neoscopelus microchir (b)
      Diaphus

5. Crustacea
   A. Decapods (shrimp)
      Acanthephyra eximia
      Acanthephyra purpurea
      Oplophorus spinosus (a)
      Heterocarpus grimaldii
      Heterocarpus laevigatus (a)
      Systellaspis cristata
      Systellaspis debilis
   B. Mysidacea (opossum shrimp)
      Gnathophausia ingens

(a) Structure identical to (I) based on chemical and physical data
on the extracted luciferins. All others are based on
luciferin-luciferase cross reactions as well as on kinetic and
bioluminescence emission spectra comparisons.

(b) Additional evidence that the luciferin is identical to (I) is
derived from chemical and physical data on the isolated
emitter which has been shown to be identical to (II).



[Structural formulas: coelenterate-type luciferin (I) and coelenterate-type oxyluciferin (II)]


The invention now being generally described, it will be more readily understood by reference to the 
following examples which are included for purposes of illustration only and are not intended to limit the 
invention unless so stated.

EXAMPLE 1: PURIFICATION OF NATURAL AEQUORIN

Aequorin was purified according to the method of Blinks et al. [Blinks, J. R., P. H. Mattingly, B. R.
Jewell, M. van Leeuwen, G. C. Harrer, and D. G. Allen, Methods Enzymol. 57, 292-328 (1978)] except that
Sephadex G-75 (superfine) was used in the second gel filtration step. The purification of aequorin took place
as follows: 

1. Collection of Aequorea in Friday Harbor, Washington, and removal of circumoral tissue (photocytes).

2. Extraction of proteins from photocytes via hypotonic lysis in EDTA.

3. Ammonium sulfate fractionation of photocyte extract (0-75%).

4. Centrifugation of (NH4)2SO4 precipitate; storage at -70°C, during and after shipment from Friday
Harbor, Washington.

5. Gel filtration on Sephadex G-50 (fine).

6. Ion-exchange on QAE Sephadex with pH-step and salt gradient elution.

7. Gel filtration on Sephadex G-75 (superfine).

8. Ion-exchange on DEAE-Sephadex with pH-step and salt gradient elution.

9. Lyophilization (in EDTA) of pure aequorin and storage at -80°C.

Steps 1-4 were performed at Friday Harbor. Except for collection and removal of circumoral tissue, all
steps were done at 0-4°C. The final product from Step 4 was stored on dry ice in 250 ml centrifuge
bottles. The material was shipped in this form.

The purification of aequorin and green fluorescent protein (GFP) was done in Athens, Georgia (Steps 
5-9). All steps were performed at 0-4°C. Aequorin-containing fractions were stored at -80°C between
steps; aequorin seems to be stable to freezing and thawing irrespective of protein concentration.
Step 5: Gel filtration on Sephadex G-50 (fine). Column dimensions: 5.8 cm x 97 cm; 2563 ml. The column
was run in 10 mM EDTA, pH 5.5 (the disodium salt was used to prepare EDTA solutions) at a flow rate of
75 ml/hour. The GFP and aequorin eluted together on this column. 65-75% of the aequorin activity was
pooled for subsequent purification. Side fractions were also pooled and stored for later purification. Aequorin
yield in this step varied from 50% to 80%; 65-75% yields were usually achieved. The capacity of the
column was approximately 1000 mg (Bradford) in 75 ml; generally smaller volumes were loaded whenever
possible.

Step 6: Ion-exchange on QAE Sephadex. Column dimensions: 5 cm diameter. 15 grams of dry Sephadex
were used in this step; the column bed volume changed during chromatography, depending on the ionic
strength and composition of the buffer.

Generally the pooled material from 6 to 10 initial G-50 steps was run on this column. Overall yield was
improved by doing this, as was efficiency. This step was performed exactly as described by Blinks et al.
(1978). After the column was loaded, the GFP was selectively eluted with a pH-step (5 mM NaAc, 5 mM
EDTA, pH 4.75). Aequorin was then eluted in a linear NaCl gradient in 10 mM EDTA, pH 5.5 (500 ml total
volume). The GFP was made 10 mM in Tris and the pH raised to 8.0 for storage at -80°C until further
purification. The aequorin pool was concentrated via ultrafiltration (Amicon YM-10 membrane) in preparation
for the next step. Aequorin yield: 80%.

Step 7: Gel filtration on Sephadex G-75 (superfine). Column dimensions: 2.8 cm x 150 cm; 924 ml. The
column was run in 10 mM EDTA, pH 5.5 at 10 ml/hour. Aequorin yield: 60-80%.
Step 8: Ion-exchange on DEAE-Sephadex. The pooled aequorin from step 7 was run directly onto this
column, which was run exactly as the QAE Sephadex column. The aequorin yield was generally 75-80%.
This step is unnecessary with most aequorin preps. The material from step 7 is usually pure, according to 
SDS-PAGE in 12% acrylamide. 

Step 9: Aequorin was lyophilized with >95% recovery provided that some EDTA was present. Recoveries
varied from 0% to 95% in the absence of EDTA (see Blinks et al., 1978).
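
The column volumes and step yields reported in Steps 5-9 can be cross-checked with a short calculation. The sketch below is illustrative only: the bed is treated as an ideal cylinder, the "usual" 65-75% range is used for Step 5, and the >95% lyophilization recovery is entered as a 95-100% range. The specification itself does not state an overall yield, so the cumulative figures are derived values under those assumptions.

```python
import math

def bed_volume_ml(diameter_cm: float, height_cm: float) -> float:
    """Volume of a cylindrical column bed; 1 cm^3 equals 1 ml."""
    return math.pi * (diameter_cm / 2) ** 2 * height_cm

# Per-step aequorin yields (low, high) as reported for Steps 5-9.
STEP_YIELDS = {
    "G-50 gel filtration": (0.65, 0.75),   # usual range for Step 5
    "QAE Sephadex": (0.80, 0.80),
    "G-75 gel filtration": (0.60, 0.80),
    "DEAE-Sephadex": (0.75, 0.80),         # Step 8, unnecessary for most preps
    "lyophilization": (0.95, 1.00),        # reported as > 95% with EDTA present
}

def overall_yield(yields: dict, skip: tuple = ()) -> tuple:
    """Multiply per-step yields into a cumulative (low, high) recovery."""
    kept = [bounds for step, bounds in yields.items() if step not in skip]
    low = math.prod(lo for lo, _ in kept)
    high = math.prod(hi for _, hi in kept)
    return low, high

print(round(bed_volume_ml(5.8, 97)), round(bed_volume_ml(2.8, 150)))
print(overall_yield(STEP_YIELDS))
print(overall_yield(STEP_YIELDS, skip=("DEAE-Sephadex",)))
```

The computed bed volumes agree with the 2563 ml and 924 ml stated for the G-50 and G-75 columns. Under the stated assumptions the cumulative recovery over Steps 5-9 is roughly 22-38%, or roughly 30-48% when Step 8 is omitted.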

EXAMPLE 2: SEQUENCING METHODOLOGY APPLIED IN THE SEQUENCE DETERMINATION OF 
AEQUORIN 

Amino acid sequence analysis was performed using automated Edman degradation (Edman and Begg,
1967). The sequence analysis of relatively large amounts of protein or peptide (10 nmol or more) was
carried out using a Model 890 B Beckman sequencer (Duke University) updated as described by Bhown et
al. (1980) and employing a 0.55 M Quadrol program with polybrene (Tarr et al., 1978). Two peptides, M3
and M5, which were small or appeared to wash out of the cup with the Quadrol method, were sequenced
using a program adapted for dimethylallylamine buffer and polybrene as suggested by Klapper et al.
(1978). Phenylthiohydantoin (PTH) derivatives of amino acids were identified using reverse phase HPLC
on a DuPont Zorbax ODS column essentially as described by Hunkapiller and Hood
(1978). Peptides which were available at the 2-10 nmol level were sequenced on a Model 890 C Beckman
sequencer (University of Washington) using a program for use with 0.1 M Quadrol (Brauer et al., 1975) and
polybrene. PTH-amino acids were identified using the reverse phase HPLC system described by Ericsson
et al. (1977). An Applied Biosystems Model 470 A gas phase sequencer (University of Washington)
(Hunkapiller et al., 1983) was used for sequence analysis when there was less than 1.5 nmol of peptide
available. PTH-amino acids from the gas phase instrument were identified using an IBM Cyano column as
described by Hunkapiller and Hood (1983).

EXAMPLE 3: CLONING AND EXPRESSION OF cDNA CODING FOR HOMOGENEOUS
APOAEQUORIN

MATERIALS AND METHODS 


Restriction enzymes were purchased from Bethesda Research Laboratories, New England Bio Labs and 
International Biotechnologies, Inc. and used according to conditions described by the supplier. RNasin and 
reverse transcriptase were obtained from Biotech and Life Sciences, respectively. Terminal transferase was 
purchased from PL Biochemicals. Coelenterate luciferin was synthesized as described [Hori, K., Anderson,
J.M., Ward, W.W. and Cormier, M.J., Biochemistry 14, 2371-2376 (1975); Hori, K., Charbonneau, H., Hart,
R.C., and Cormier, M.J., Proc. Nat'l. Acad. Sci., USA 74, 4285-4287 (1977); Inouye, S., Sugiura, H., Kakoi,
H., Hasizuma, K., Goto, T. and Iio, H., Chem. Lett., 141-144 (1975)] and stored as a lyophilized powder
until needed. 

RNA Isolation and In Vitro Translation

Aequorea victoria jellyfish were collected at the University of Washington Marine Biology Laboratory at
Friday Harbor, Washington. The circumoral rings were cut from the circumference of the jellyfish and
immediately frozen in a dry ice/methanol bath. The tissue was kept at -70°C until needed.

RNA was isolated according to the method of Kim et al. [Kim, Y.-J., Shuman, J., Sette, K. and Przybyla,
A., J. Cell. Biol. 96, 393-400 (1983)] and poly(A+)RNA was prepared using a previously described
technique [Aviv, H. and Leder, P., PNAS 69, 1408-1412 (1972)].

Poly(A+)RNA (1 µg) and poly(A-)RNA (20 µg) were translated using the rabbit reticulocyte in vitro
translation system [Pelham, H.R.B. and Jackson, R.J., Eur. J. Biochem., 247 (1976); Merrick, W.C., in
Methods in Enz. 101(c), 606-615 (1983)]. The lysate was stripped of its endogenous mRNA with
micrococcal nuclease. Each translation (62 µl total volume) was incubated 90 min at 25°C in the presence
of 35S-methionine (38 µCi).

Two µl of each translation were removed for analysis by electrophoresis. Apoaequorin was immunoprecipitated
by adding antiaequorin (2 µl) and Staph aureus cells to 50 µl of each translation mixture.
After several washings the antibody-apoaequorin complex was dissociated by heating in the presence of
SDS. The translated products were analyzed on an SDS polyacrylamide (13%) gel. Following electrophoresis
the gel was stained with Coomassie R-250 to identify the protein standards and then the gel was
impregnated with PPO in DMSO. Fluorography was performed at -70°C.

Recombinant DNA Procedures

Double-stranded cDNA was synthesized from total Aequorea poly(A+)RNA as described by Wickens
et al. [Wickens, M.P., Buell, G.N. and Schimke, R.T., J. Biol. Chem. 253, 2483-2495 (1978)]. After addition
of homopolymeric dC tails, double-stranded cDNA was annealed to dG-tailed Pst I-cut pBR322
[Villa-Komaroff, L., Efstratiadis, A., Broome, S., Lomedico, P., Tizard, R., Naber, S.P., Chick, W.L., and Gilbert,
W., PNAS 75, 3727-3731 (1978)] and used to transform E. coli strain SK1592. Tetracycline-resistant,
ampicillin-sensitive colonies were transferred to and frozen in microtiter dishes at -70°C.

The Aequorea cDNA library was screened for the aequorin cDNA using a synthetic oligonucleotide
mixture. The oligonucleotide mixture was supplied by Charles Cantor and Carlos Argarana (Columbia
University). Following their purification by polyacrylamide electrophoresis [Maniatis, T. and Efstratiadis, A.,
Meth. in Enz. 65, 299-305 (1980)] the 17-mers were radioactively labelled using polynucleotide kinase
and γ-32P-ATP [Maxam, A.M. and Gilbert, W., Meth. in Enz. 65, 499-559 (1980)]. The unincorporated
32P was removed by DEAE-cellulose ion-exchange chromatography.

The Aequorea cDNA bank was screened in the following manner: The E. coli recombinants were
transferred from frozen cultures to nitrocellulose filters (7 x 11 cm) placed on Luria agar plates. The colonies
were grown 12 hours at 37°C and lysed, and then the DNA was fixed as Taub and Thompson [Taub, F. and
Thompson, E.B., Anal. Biochem. 126, 222-230 (1982)] described for Whatman 541 paper. The filters
were baked under vacuum for 2 hours after they had been air-dried.

The filters were incubated at 55°C for 12-20 hours in 3 ml/filter of a prehybridization solution (10X
NET, 0.1% SDS, 3X Denhardt's) after first wetting them in 1X SSC. The solution was poured from the
hybridization bag and replaced with 1 ml/filter of the hybridization solution (10X NET, 0.1% SDS, 3X
Denhardt's, 1x10* cpm 32P-labelled 17-mers per filter). The hybridization was carried out for 24 hours at
37°C, after which the filters were washed four times in 10X SSC at 4°C for 10 min. The filters were air dried
and then wrapped in plastic wrap. Kodak XAR-5 film was exposed to the filters at -70°C using a DuPont
Cronex intensifying screen.

Growth and Extraction Procedures for E. coli 

E. coli SK1592 containing pAEQ1 through pAEQ6 were grown overnight in 25 ml of Luria broth at 37°C. The
cells were centrifuged and then resuspended in 5 ml of 10% sucrose, 50 mM Tris pH 8. The cells were lysed
with the addition of the following: 7.2 µl of 0.1 M phenylmethylsulfonylfluoride, 312 µl of 0.2 M EDTA, 10 mg
lysozyme, and 10 µl of 10 mg/ml RNase A. After 45 min on ice, the mixture was centrifuged at 43,500 x g for
one hour. The supernatant was saved.

Purification and Assay of Aequorin - Aequorin was extracted and purified by the method of Blinks et al.
[Blinks, J.R., Wier, W.G., Hess, P. and Prendergast, F.G., Prog. Biophys. Molec. Biol., 40, 1-114 (1982)].
Aequorin, or photoprotein activity, was measured by injecting 5 µl of the sample into 0.5 ml of 0.1 M CaCl2,
0.1 M Tris, pH 8.0 and simultaneously measuring peak light intensity and total photons. The design of
photometers for making such measurements and the calibration of the instrument for absolute photon yields have
been previously described [Anderson, J.M., Faini, G.J. and Wampler, J.E., Methods in Enz. 57, 529-559
(1978); Charbonneau, H. and Cormier, M.J., J. Biol. Chem. 254, 769-780 (1979)].

Partial Purification of Apoaequorin Activity in pAEQ1 Extracts - The expressed apoaequorin was partially
purified by passage of 23 ml of a pAEQ1 extract over a 42 ml bed volume of Whatman DE-22 equilibrated
in 1 mM EDTA, 1.5 mM Tris, pH 7.5. An 800 ml NaCl gradient (0-1 M) was applied and the active
apoaequorin eluted at 0.3 M NaCl. The peak fractions were pooled and dialyzed against 0.5 M KCl, 10 mM
EDTA and 15 mM Tris, pH 7.5 for the experiments described in Fig. 3.

RESULTS AND DISCUSSION 

In Vitro Translation of Aequorea poly(A+)RNA - Approximately 1.6 µg poly(A+)RNA was isolated from
each gram of frozen jellyfish tissue. The results of the in vitro translation of the Aequorea poly(A+)RNA are
shown in Fig. 1. The translation products which reacted with anti-aequorin are shown in lane 4. The 35S
counts immunoprecipitated represented 0.3% of the total acid-precipitable counts in the translation, which
implies that the apoaequorin mRNA represents approximately 0.3% of the total poly(A+)mRNA population.
This relative abundance agrees well with the fraction of total protein (0.5%) which corresponds to aequorin
in a crude extract of circumoral rings from Aequorea. No proteins were immunoprecipitated when the in
vitro translation was performed in the absence of Aequorea RNA (lane 2) or in the presence of Aequorea
poly(A-)RNA (data not shown).

The primary translation products immunoprecipitated with anti-aequorin migrated on the SDS-PAGE
gel with an apparent molecular weight (23,400 daltons, lane 4) slightly greater than that for native
aequorin isolated from Aequorea (22,800 daltons, indicated in Fig. 1). These data, and the data shown in Fig.
4, are consistent with the presence of a presequence of approximately seven amino acids in the primary
translation product.
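
The presequence length can be roughly estimated by dividing the mass difference by an average residue mass. The sketch below assumes about 110 daltons per residue, a commonly used approximation that is not taken from this specification, so the result is indicative only.

```python
AVG_RESIDUE_DA = 110.0  # commonly used average residue mass; an assumption, not from the text

def extra_residues(precursor_da: float, mature_da: float) -> float:
    """Approximate number of additional amino acids implied by a mass difference."""
    return (precursor_da - mature_da) / AVG_RESIDUE_DA

# Apparent molecular weights from the SDS-PAGE comparison above.
print(round(extra_residues(23_400, 22_800), 1))
```

The SDS-PAGE values give an estimate of about five to six residues. Applying the same function to the gel filtration values reported for Fig. 4 (20,600 versus 19,600) gives about nine, so the two measurements bracket the figure of approximately seven amino acids.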

The proteins immunoprecipitated from the poly(A+)RNA translation migrated as a doublet or even a
triplet (lane 4, Fig. 1) if one studies the original autoradiogram. This result can be interpreted in two ways.
Firstly, multiple apoaequorin genes may exist in Aequorea victoria and their respective preproteins differ in
molecular weight due to various lengths of their presequences. Aequorin isozymes [Blinks, J.R. and Harrer,
G.C., Fed. Proc. 34, 474 (1975)] may be indicative of such a multi-gene family. Secondly, the Aequorea
victoria population at Friday Harbor may consist of several species of Aequorea.

Identification of Aequorin cDNAs

The Aequorea cDNA library used contained 6000 recombinants having inserts greater than 450 bp. Of 
25 random recombinants screened, none had inserts less than 500 bp and two were larger than 3 kbp. 
The Aequorea cDNA bank was screened with the following mixed synthetic oligonucleotide probe: 



The DNA sequences of these oligonucleotides were determined by an examination of the complete amino
acid sequence of apoaequorin. These oligonucleotides are complementary to the mRNA which codes for
the peptide Trp173-Tyr-Thr-Met-Asp-Pro178 in the carboxy-terminal region of the aequorin polypeptide. The
17-mers were 32P-labelled and hybridized to plasmid DNA from the Aequorea cDNA library as described
in Methods.

Six transformants were identified which contained plasmids having inserts that hybridized to the
synthetic oligonucleotides. The restriction map of the plasmid containing the largest Pst I insert, pAEQ1, is
shown in Fig. 2. No hybridization of the synthetic oligonucleotides would occur if pAEQ1 was digested with
BamHI. Examination of the 17-mer DNA sequences shows that the hybridization probe does contain a BamHI
recognition sequence (GGATCC). Hence, the BamHI site in pAEQ1 could be used to identify the 3'-region
of the apoaequorin coding sequence. The recombinant plasmid pAEQ1 does indeed contain the
apoaequorin cDNA as demonstrated by its expression in E. coli, as described below.
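
The relationship between the peptide Trp-Tyr-Thr-Met-Asp-Pro, a mixed 17-mer probe, and the BamHI site can be illustrated by enumeration. The probe sequences themselves are not reproduced in this text, so the sketch below is a reconstruction under stated assumptions: the standard genetic code is used, and the 17-mers are assumed to be the antisense complements of the first 17 coding positions (the fully degenerate third position of the Pro codon being omitted).

```python
from itertools import product

# Standard genetic code for the residues of the stated peptide (DNA alphabet).
CODONS = {
    "Trp": ["TGG"],
    "Tyr": ["TAT", "TAC"],
    "Thr": ["ACT", "ACC", "ACA", "ACG"],
    "Met": ["ATG"],
    "Asp": ["GAT", "GAC"],
    "Pro": ["CCT", "CCC", "CCA", "CCG"],
}
PEPTIDE = ["Trp", "Tyr", "Thr", "Met", "Asp", "Pro"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the antisense strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def candidate_17mers() -> list:
    """All distinct 17-nt antisense sequences covering the first 17 coding
    positions of the peptide (assumed probe design, see lead-in)."""
    coding = {"".join(c)[:17] for c in product(*(CODONS[aa] for aa in PEPTIDE))}
    return sorted(reverse_complement(s) for s in coding)

probes = candidate_17mers()
with_bamhi = [p for p in probes if "GGATCC" in p]
print(len(probes), len(with_bamhi))
```

Under these assumptions the mixture contains sixteen sequences, and those derived from the Asp codon GAT begin with GGATCC, which is consistent with the loss of hybridization after BamHI digestion described above.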
Expression of Apoaequorin in E. coli - In order to find out whether any of these six transformants were
expressing biologically active apoaequorin, extracts of each of these, as well as the host strain, were
prepared as described in Methods. To 0.5 ml of each extract were added β-mercaptoethanol (2 mM) and
coelenterate luciferin (0.1 mM), and the mixture was allowed to incubate at 4°C for 20 hours. This mixture was then
assayed for Ca2+-dependent photoprotein activity as described in Methods. Ca2+-dependent luminescence
was observed in extracts prepared from the recombinant pAEQ1, but no such luminescence was
observed in extracts of the host strain or in extracts derived from any of the other transformants. The inserts
in pAEQ1-6 cross-hybridized, suggesting that they contain homologous DNA sequences. However, if the
cDNA inserts in pAEQ2-6 were not of sufficient length or were oriented improperly within the plasmid,
apoaequorin activity in those extracts would not be expected.

The kinetics of formation of photoprotein activity from extracts of pAEQ1 is similar to that observed with
native, mixed apoaequorin as shown in Fig. 3. The requirements for the formation of photoprotein activity in this
extract are also identical to those observed when authentic apoaequorin is used. As Fig. 3 shows, dissolved O2
is required. Furthermore, the elimination of either β-mercaptoethanol or coelenterate luciferin from the
reaction mixture results in zero production of Ca2+-dependent photoprotein activity. Injection of the active
component into Ca2+-free buffers produced no luminescence. The subsequent addition of Ca2+ resulted in
a luminescence flash.

To further characterize the active component in extracts of the pAEQ1-containing transformant, an
extract of the transformant containing this recombinant plasmid was subjected to chromatography over
DE-22 as described in Methods. The apoaequorin activity eluted at about 0.3 M salt, which is similar to that
observed for authentic apoaequorin. The active fractions were then incubated in the presence of coelenterate
luciferin, β-mercaptoethanol and oxygen to generate photoprotein activity as described in Fig. 3.
This mixture was then subjected to gel filtration. As Fig. 4 shows, the photoprotein activity generated from
the partially purified component in pAEQ1 extracts eluted from the column with an Mr of 20,600 as
compared to a value of 19,600 for native aequorin. Similar results were observed during in vitro translation
experiments (Fig. 1). From the data of Fig. 4, one may also conclude that the luciferin becomes tightly
associated with the active component in pAEQ1 extracts under the charging conditions used.

The pooled photoprotein fraction from Fig. 4 produces a luminescence flash upon the addition of Ca2+.
The kinetics of this flash was indistinguishable from the kinetics of the Ca2+-dependent aequorin reaction.
Other recombinant plasmids did not express a light-emitting protein when present in transformants, as is
shown in Table 6 below.

The above data show that the cDNA insert in pAEQ1 represents the full-length cDNA coding for
apoaequorin. The data also show that this cDNA is being expressed in pAEQ1 and that the protein product
is indistinguishable in its biological properties from that of native, mixed apoaequorin. The level of
expression was estimated to be about 0.01% of the total soluble protein.

TABLE 6

Recharging of Apoaequorin in Extracts of Apoaequorin cDNA Clones

                        Peak Light Intensity (hv sec^-1) in Extracts
Clone                   +Ca2+           -Ca2+
pAEQ1                   5 x 10^6        0
pAEQ2                   0               0
pAEQ3                   0               0
pAEQ4                   0               0
pAEQ5                   0               0
pAEQ6                   0               0
SK1592 (Host Strain)    0               0

To 0.5 ml of each extract was added mercaptoethanol (2 mM) and coelenterate luciferin (0.015 mM).
The mixture was incubated at 4°C for 20 hours. A 5 µl sample was removed, injected into 0.5 ml of 0.1 mM
Ca2+, and peak light intensity measured.

EXAMPLE 4: INCREASE OF APOAEQUORIN EXPRESSION IN E. COLI USING THE INDUCIBLE LAC 
PROMOTER 

A relatively low level of apoaequorin is expressed in an E. coli strain containing pAEQ1. In order to
increase the expression level, the apoaequorin gene from pAEQ1 was subcloned into another plasmid
(pUC9) in such a manner that transcription of the apoaequorin gene is increased due to its position relative 
to the inducible lac promoter. 

The 0.75 kb Pst I fragment from pAEQ1 was isolated and cloned into the unique Pst I site of pUC9. Two
different plasmids, pAEQ7 and pAEQ8, were obtained. Plasmid pAEQ8 contained the Pst I insert in the
desired orientation shown in FIGURE 2(b). Plasmid pAEQ7 had the fragment inserted in the opposite
orientation.

E. coli strains (host: JM105) containing pAEQ7 and pAEQ8 were grown individually at 37°C in 20 ml of
Luria broth containing ampicillin (50 µg/ml). When the OD550 reached 0.6, the culture was made 1 mM in
IPTG using a 100 mM stock solution. A 5-ml aliquot was removed while the remainder of the culture continued shaking.
Aliquots were also removed 1 and 2 hours later. Immediately following the removal of each aliquot the cells
were centrifuged and frozen at -20°C until needed.

Extracts of the six aliquots were assayed for apoaequorin activity. The E. coli cells were first lysed by
resuspending the cells in 1 ml of 50 mM Tris, pH 8, 10% sucrose, 25 µg/ml lysozyme, and 20 µg/ml RNase
A. After 45 minutes on ice, the mixture was centrifuged at 43,500 x g for one hour. The supernatant was
saved. The supernatants (extracts) were then incubated overnight in the presence of coelenterate luciferin
and mercaptoethanol (250 µl extract, 3 µl of 5% mercaptoethanol in water, and 3 µl of coelenterate
luciferin). The aequorin activity in these incubation mixtures was then assayed by injecting 5 µl of the
sample into 0.5 ml of 0.1 M CaCl2, 0.1 M Tris, pH 8.0 and simultaneously measuring peak light intensity and
total photons. Protein was determined using the Bradford assay (Bio-Rad).

Table 7 contains the results obtained from the six extracts of strains containing pAEQ7 and pAEQ8, an
extract from a control E. coli strain (SK1592) and one from a strain containing pAEQ1.

Table 7

Apoaequorin Expression Levels in E. coli Strains Containing pAEQ1, pAEQ7 and pAEQ8

                                                       % of Soluble Protein Represented by Aequorin
                 Hrs. After       Aequorin
                 Induction        Specific Activity    Assuming 1%            Assuming 20%
Plasmid          with IPTG        (photons mg⁻¹)       Charging Efficiency    Charging Efficiency

pAEQ7            0                ≤0.02                               0
                 1                <0.04                               0
                 2                ≤0.02                               0
pAEQ8            0                0.668                0.042 (11)*            0.002 (10)
                 1                17.2                 1.07 (268)             0.05 (263)
                 2                38.4                 2.40 (600)             0.12 (600)
pAEQ1                             0.061                0.004                  0.0002
Strain SK1592                     0                                   0

* The numbers in parentheses represent the fold increase over the activity observed in extracts
containing pAEQ1.



The aequorin activity level in these extracts is dependent upon the charging efficiency with the luciferin.
It is not possible to quantitate the charging efficiency in crude extracts. The observed charging efficiency
was no higher than 20% using apoaequorin isolated from Aequorea, and the efficiency is probably
somewhat lower in the E. coli extracts. Hence, calculations based on 1% and 20% charging efficiency are
included.

The expression from pAEQ8 is significantly higher two hours post induction (600-fold) when compared
to expression from pAEQ1. No induction of the aequorin gene is observed from pAEQ7, which contains the
aequorin gene in the opposite orientation to that in pAEQ8.
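
The relationship between the two right-hand columns of Table 7, and the fold increases quoted in parentheses, can be checked with a short calculation. The following is a minimal sketch only. It assumes that the estimated percentage of soluble protein scales inversely with the assumed charging efficiency (which is what the factor of 20 between the two columns implies); the function and variable names are illustrative and are not part of the original disclosure. Because it starts from the rounded values printed in the table, its results agree with the tabulated figures only to within rounding.

```python
# Values transcribed from Table 7: % of soluble protein, assuming 1% charging efficiency.
PERCENT_AT_1PCT = {
    "pAEQ8, 0 h": 0.042,
    "pAEQ8, 1 h": 1.07,
    "pAEQ8, 2 h": 2.40,
}
PAEQ1_PERCENT_AT_1PCT = 0.004  # reference plasmid pAEQ1, same assumed efficiency


def rescale(percent, assumed_from, assumed_to):
    """Re-express an estimate under a different assumed charging efficiency.

    Assumption: the estimated amount of apoaequorin is inversely proportional
    to the fraction assumed to have been charged with luciferin.
    """
    return percent * assumed_from / assumed_to


def fold_increase(percent, reference=PAEQ1_PERCENT_AT_1PCT):
    """Ratio of an estimate to the pAEQ1 estimate at the same assumed efficiency."""
    return percent / reference


for label, pct in PERCENT_AT_1PCT.items():
    at_20 = rescale(pct, assumed_from=0.01, assumed_to=0.20)
    print(f"{label}: {at_20:.4f}% at 20% efficiency, "
          f"{fold_increase(pct):.1f}-fold over pAEQ1")
```

For the two-hour pAEQ8 sample this reproduces the tabulated 0.12% and the 600-fold increase; the earlier time points differ slightly from the table because the printed inputs are rounded.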

These data demonstrate that the apoaequorin expression level can be significantly increased if the cDNA
nucleotide sequence is positioned appropriately 3' to an inducible promoter.

EXAMPLE 5: IDENTIFICATION OF THE STRUCTURE OF A SPECIFIC APOAEQUORIN GENE

The structure of the Pst I - Pst I fragment shown in FIGURE 2 and present in plasmids pAEQ1, pAEQ7
and pAEQ8 was determined using standard techniques of gene sequence analysis. Beginning at the 3' end
there are 69 nucleotides that are untranslated. Expression of the translated region results in the expression
of a protein containing seven additional amino acid residues on the N-terminal end as compared to the
protein (apoaequorin) isolated in the manner described in Example 1, which begins with VAL as the
N-terminal. The seven residues appear to be nicked off by a protease during the isolation procedure of
Example 1. The C-terminal codon, which codes for PRO, is followed by a stop codon and twelve additional
nucleotides, clearly indicating that the isolated cDNA is full-length. The entire double-stranded DNA and
amino-acid sequences are shown below:



1 / -22
CTT TGC ACC AAA ACA CCA CAT CAA ATC TCC ACT TGA TAA ACT AAA TCG TCC CAA CGG CAA
GAA ACG TGG TTT TGT GGT GTA GTT TAG AGG TGA ACT ATT TGA TTT AGC AGG GTT GCC GTT

61 / -2
            MET THR SER GLU GLN TYR SER VAL LYS LEU THR PRO ASP PHE ASP ASN PRO
CAG GCC AAC ATG ACC AGC GAA CAA TAC TCA GTC AAG CTT ACA CCA GAC TTC GAC AAC CCA
GTC CGG TTG TAC TGG TCG CTT GTT ATG AGT CAG TTC GAA TGT GGT CTG AAG CTG TTG GGT

121 / 18
LYS TRY ILE GLY ARG HIS LYS HIS MET PHE ASN PHE LEU ASP VAL ASN HIS ASN GLY ARG
AAA TGG ATT GGA CGA CAC AAG CAC ATG TTT AAT TTT CTT GAT GTC AAC CAC AAT GGA AGG
TTT ACC TAA CCT GCT GTG TTC GTG TAC AAA TTA AAA GAA CTA CAG TTG GTG TTA CCT TCC

181 / 38
ILE SER LEU ASP GLU MET VAL TYR LYS ALA SER ASP ILE VAL ILE ASN ASN LEU GLY ALA
ATC TCT CTT GAC GAG ATG GTC TAC AAG GCG TCC GAT ATT GTT ATA AAC AAT CTT GGA GCA
TAG AGA GAA CTG CTC TAC CAG ATG TTC CGC AGG CTA TAA CAA TAT TTG TTA GAA CCT CGT

241 / 58
THR PRO GLU GLN ALA LYS ARG HIS LYS ASP ALA VAL GLU ALA PHE PHE GLY GLY ALA GLY
ACA CCT GAA CAA GCC AAA CGT CAC AAA GAT GCT GTA GAA GCC TTC TTC GGA GGA GCT GGA
TGT GGA CTT GTT CGG TTT GCA GTG TTT CTA CGA CAT CTT CGG AAG AAG CCT CCT CGA CCT

301 / 78
MET LYS TYR GLY VAL GLU THR GLU TRY PRO GLU TYR ILE GLU GLY TRY LYS ARG LEU ALA
ATG AAA TAT GGT GTA GAA ACT GAA TGG CCT GAA TAC ATC GAA GGA TGG AAA AGA CTG GCT
TAC TTT ATA CCA CAT CTT TGA CTT ACC GGA CTT ATG TAG CTT CCT ACC TTT TCT GAC CGA

361 / 98
SER GLU GLU LEU LYS ARG TYR SER LYS ASN GLN ILE THR LEU ILE ARG LEU TRY GLY ASP
TCC GAG GAA TTG AAA AGG TAT TCA AAA AAC CAA ATC ACA CTT ATT CGT TTA TGG GGT GAT
AGG CTC CTT AAC TTT TCC ATA AGT TTT TTG GTT TAG TGT GAA TAA GCA AAT ACC CCA CTA

421 / 118
ALA LEU PHE ASP ILE ILE ASP LYS ASP GLN ASN GLY ALA ILE SER LEU ASP GLU TRY LYS
GCA TTG TTC GAT ATC ATT GAC AAA GAC CAA AAT GGA GCT ATT TCA CTG GAT GAA TGG AAA
CGT AAC AAG CTA TAG TAA CTG TTT CTG GTT TTA CCT CGA TAA AGT GAC CTA CTT ACC TTT

481 / 138
ALA TYR THR LYS SER ALA GLY ILE ILE GLN SER SER GLU ASP CYS GLU GLU THR PHE ARG
GCA TAC ACC AAA TCT GCT GGC ATC ATC CAA TCG TCA GAA GAT TGC GAG GAA ACA TTC AGA
CGT ATG TGG TTT AGA CGA CCG TAG TAG GTT AGC AGT CTT CTA ACG CTC CTT TGT AAG TCT

541 / 158
VAL CYS ASP ILE ASP GLU SER GLY GLN LEU ASP VAL ASP GLU MET THR ARG GLN HIS LEU
GTG TGC GAT ATT GAT GAA AGT GGA CAG CTC GAT GTT GAT GAG ATG ACA AGA CAA CAT TTA
CAC ACG CTA TAA CTA CTT TCA CCT GTC GAG CTA CAA CTA CTC TAC TGT TCT GTT GTA AAT

601 / 178
GLY PHE TRY TYR THR MET ASP PRO ALA CYS GLU LYS LEU TYR GLY GLY ALA VAL PRO
GGA TTT TGG TAC ACC ATG GAT CCT GCT TGC GAA AAG CTC TAC GGT GGA GCT GTC CCC TAA
CCT AAA ACC ATG TGG TAC CTA GGA CGA ACG CTT TTC GAG ATG CCA CCT CGA CAG GGG ATT

661 / 198
GAA ACT CTG CGC
CTT TGA GAC GCG

This identified sequence illustrates that selection of any amino acid sequence from the variations shown
in Table 3 will produce an active apoaequorin, since the sequence shown above contains both principal and
minor amino acids at different positions of microheterogeneity.

Accordingly, the different genes that code for the various peptides are likewise shown to represent 
biologically active molecules. 
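
The reading frame described in this Example can be checked by translating part of the listing. The sketch below is illustrative only and is not part of the original disclosure. It uses the standard genetic code and the upper-strand row beginning at nucleotide 61, as read from the listing with ambiguous characters resolved against the complementary strand (an assumption); the helper names are hypothetical. It shows that 69 nucleotides precede the initiating ATG and that removing the seven extra N-terminal residues leaves a peptide beginning with VAL, so that codon n of the coding sequence corresponds to residue n - 7 of the isolated protein.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Nucleotides 61-120 of the upper strand, as read from the listing in this Example
# (ambiguous characters resolved against the complementary strand; an assumption).
ROW_START = 61
ROW_61_120 = (
    "CAG GCC AAC ATG ACC AGC GAA CAA TAC TCA "
    "GTC AAG CTT ACA CCA GAC TTC GAC AAC CCA"
).replace(" ", "")


def translate(dna):
    """Translate the complete codons of a DNA string into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


atg_index = ROW_61_120.find("ATG")          # offset of the initiating codon in this row
untranslated = ROW_START - 1 + atg_index    # nucleotides preceding the ATG
expressed = translate(ROW_61_120[atg_index:])
isolated = expressed[7:]                    # drop the seven extra N-terminal residues

print(untranslated)  # 69
print(expressed)     # MTSEQYSVKLTPDFDNP
print(isolated)      # VKLTPDFDNP
```

In this numbering the codon CCA at position 12 of the coding sequence encodes the PRO at position 5 of the isolated peptide.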

The previous discussion in this application relating to other peptides, DNA molecules, vectors, etc.,
based on the general sequences disclosed in this application and representing minor variations thereof
(thereby having the biological activity of apoaequorin) is equally applicable to the specific sequences
disclosed in this Example. This Example discloses a peptide, a DNA sequence coding for the peptide, and
a full DNA sequence between two restriction sites, among others. The coding DNA sequence (optionally
including a terminal stop codon) is disclosed as a separate entity from the full DNA sequence and is to be
considered separately therefrom, including its variations. Any DNA disclosed as double-stranded DNA also
is considered to disclose the individual single-stranded DNA forming the same as well as RNA equivalent
thereto by a transcription process.
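
The relationship among a disclosed strand, its complementary strand, and the equivalent RNA can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: the function names are hypothetical, the input is the first three codons of the coding sequence as read from the listing, and the RNA equivalent is taken simply as the coding strand with U in place of T.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complementary_strand(strand):
    """Base-by-base complement, written in the same left-to-right order as the
    input (the way the lower strand is aligned under the upper strand)."""
    return strand.translate(_COMPLEMENT)


def reverse_complement(strand):
    """Complementary strand written in the opposite left-to-right order."""
    return complementary_strand(strand)[::-1]


def rna_equivalent(coding_strand):
    """RNA carrying the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.replace("T", "U")


coding = "ATGACCAGC"  # first three codons of the coding sequence (MET THR SER)
print(complementary_strand(coding))  # TACTGGTCG
print(reverse_complement(coding))    # GCTGGTCAT
print(rna_equivalent(coding))        # AUGACCAGC
```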

The present invention is not limited to the specific vectors and host bacteria disclosed in these
Examples, although the vectors and hosts of the Examples are used in some of the preferred embodiments
of the invention. All of the bacteria and plasmids used are readily available to those of ordinary skill in
biotechnology through various sources. For example, E. coli JM105 and plasmids pUC9 and pBR322 are
commercially available from P.L. Biochemicals, Inc., 800 Centennial Avenue, Piscataway, New Jersey,
08854 (a division of Pharmacia, Inc.), where they are identified by Catalogue Nos. 27-1550-01,
27-4918-01, and 27-1750-01, respectively. Furthermore, these and other microorganisms and plasmids that
can be used in the practice of the invention are also available from the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland, 20852. Representative examples and their deposit numbers are
E. coli SK1592, ATCC No. 35106; E. coli JM105, ATCC No. 53029; pUC9, ATCC No. 37252; and pBR322,
ATCC No. 31344.

The invention now being fully described, it will be apparent to one of ordinary skill in the art that many
changes and modifications can be made thereto without departing from the spirit or scope of the invention
as set forth herein.

Claims 

1. A homogeneous peptide having the sequence

VAL LYS LEU THR PRO ASP PHE ASP ASN PRO
LYS TRY ILE GLY ARG HIS LYS HIS MET PHE ASN PHE LEU ASP VAL ASN HIS ASN GLY ARG
ILE SER LEU ASP GLU MET VAL TYR LYS ALA SER ASP ILE VAL ILE ASN ASN LEU GLY ALA
THR PRO GLU GLN ALA LYS ARG HIS LYS ASP ALA VAL GLU ALA PHE PHE GLY GLY ALA GLY
MET LYS TYR GLY VAL GLU THR GLU TRY PRO GLU TYR ILE GLU GLY TRY LYS ARG LEU ALA
SER GLU GLU LEU LYS ARG TYR SER LYS ASN GLN ILE THR LEU ILE ARG LEU TRY GLY ASP
ALA LEU PHE ASP ILE ILE ASP LYS ASP GLN ASN GLY ALA ILE SER LEU ASP GLU TRY LYS
ALA TYR THR LYS SER ALA GLY ILE ILE GLN SER SER GLU ASP CYS GLU GLU THR PHE ARG
VAL CYS ASP ILE ASP GLU SER GLY GLN LEU ASP VAL ASP GLU MET THR ARG GLN HIS LEU
GLY PHE TRY TYR THR MET ASP PRO ALA CYS GLU LYS LEU TYR GLY GLY ALA VAL PRO;



in which sequence at least one of the following amino acids is replaced as follows:

PRO 5 is replaced by SER,
ASP 8 is replaced by ASN,
LYS 11 is replaced by ARG,
GLU 78 is replaced by ASP,
GLU 81 is replaced by ALA,
ARG 88 is replaced by LYS,
GLU 92 is replaced by ASP or CYS,
ARG 96 is replaced by LYS,
SER 98 is replaced by ALA,
GLN 101 is replaced by GLU,
ILE 102 is replaced by PRO,
LEU 107 is replaced by ILE,
ILE 116 is replaced by VAL,
SER 135 is replaced by ALA,
SER 141 is replaced by THR,

said peptide being capable of binding coelenterate luciferin and emitting light in the presence of Ca²⁺.

2. A peptide as claimed in claim 1 wherein from 1 to 15 amino acids are absent from either the amino
terminal, the carboxy terminal, or both terminals.

3. A peptide as claimed in claim 2 wherein said absent amino acids are absent from the amino terminal.

4. A peptide as claimed in claim 2 wherein said absent amino acids are absent from the carboxy terminal.

5. A peptide as claimed in claim 1 wherein from 1 to 10 additional amino acids are attached sequentially
to the amino terminal, carboxy terminal or both terminals.

6. A peptide as claimed in claim 5 wherein said additional amino acids are attached to the amino acid
terminal.

7. A peptide as claimed in claim 5 wherein said additional amino acids are attached to the carboxy
terminal.

8. A substantially pure polynucleotide molecule which comprises the nucleotide sequence 

ATG ACC AGC GAA CAA TAC TCA GTC AAG CTT ACA CCA GAC TTC GAC AAC CCA
AAA TGG ATT GGA CGA CAC AAG CAC ATG TTT AAT TTT CTT GAT GTC AAC CAC AAT GGA AGG
ATC TCT CTT GAC GAG ATG GTC TAC AAG GCG TCC GAT ATT GTT ATA AAC AAT CTT GGA GCA
ACA CCT GAA CAA GCC AAA CGT CAC AAA GAT GCT GTA GAA GCC TTC TTC GGA GGA GCT GGA
ATG AAA TAT GGT GTA GAA ACT GAA TGG CCT GAA TAC ATC GAA GGA TGG AAA AGA CTG GCT
TCC GAG GAA TTG AAA AGG TAT TCA AAA AAC CAA ATC ACA CTT ATT CGT TTA TGG GGT GAT
GCA TTG TTC GAT ATC ATT GAC AAA GAC CAA AAT GGA GCT ATT TCA CTG GAT GAA TGG AAA
GCA TAC ACC AAA TCT GCT GGC ATC ATC CAA TCG TCA GAA GAT TGC GAG GAA ACA TTC AGA
GTG TGC GAT ATT GAT GAA AGT GGA CAG CTC GAT GTT GAT GAG ATG ACA AGA CAA CAT TTA
GGA TTT TGG TAC ACC ATG GAT CCT GCT TGC GAA AAG CTC TAC GGT GGA GCT GTC CCC.





in which sequence at least one of the following codons is replaced as follows:

CCA 12 is replaced by a codon encoding SER,
GAC 15 is replaced by a codon encoding ASN,
AAA 18 is replaced by a codon encoding ARG,
GAA 85 is replaced by a codon encoding ASP,
GAA 88 is replaced by a codon encoding ALA,
AGA 95 is replaced by a codon encoding LYS,
GAG 99 is replaced by a codon encoding ASP or CYS,
AGG 103 is replaced by a codon encoding LYS,
TCA 105 is replaced by a codon encoding ALA,
CAA 108 is replaced by a codon encoding GLU,
ATC 109 is replaced by a codon encoding PRO,
TTA 114 is replaced by a codon encoding ILE,
ATT 123 is replaced by a codon encoding VAL,
TCT 142 is replaced by a codon encoding ALA,
TCG 148 is replaced by a codon encoding THR,

said sequence encoding a peptide capable of binding coelenterate luciferin and emitting light in the
presence of Ca²⁺.

9. A polynucleotide molecule as claimed in claim 8 wherein from the 3' terminal from 8 to 22 codons,
encoding from 8 to 22 amino acids, and from the 5' terminal from 1 to 15 codons, encoding from 1 to
15 amino acids, are absent.

10. A polynucleotide molecule as claimed in claim 8 wherein from the 3' terminal from 8 to 22 codons,
encoding from 8 to 22 amino acids, are absent.

11. A polynucleotide molecule as claimed in claim 8 wherein from the 5' terminal from 1 to 15 codons,
encoding from 1 to 15 amino acids, are absent.

12. A polynucleotide molecule as claimed in claim 8 wherein from 1 to 3 codons, encoding from 1 to 3
amino acids, are sequentially added to the 3' terminal and from 1 to 10 codons, encoding from 1 to 10
amino acids, are sequentially added to the 5' terminal.

13. A polynucleotide molecule as claimed in claim 8 wherein from 1 to 3 codons, encoding from 1 to 3
amino acids, are sequentially added to the 3' terminal.

14. A polynucleotide molecule as claimed in claim 8 wherein from 1 to 10 codons, encoding from 1 to 10
amino acids, are sequentially added to the 5' terminal.

15. A polynucleotide molecule as claimed in any one of claims 8 to 14 wherein said molecule is
single-stranded DNA.

16. A polynucleotide molecule as claimed in any one of claims 8 to 14 wherein said molecule is
double-stranded DNA.

17. A recombinant DNA vector wherein said vector is capable of reproducing in a microorganism and said
vector comprises a nucleotide sequence as claimed in any one of claims 8 to 14.

18. A recombinant vector as claimed in claim 17 wherein said nucleotide sequence is preceded by a lac
promoter.

19. A genetically engineered microorganism comprising a recombinant vector as claimed in claims 17 or
18.

20. A process for producing a peptide as claimed in any one of claims 1 to 7 which process comprises
either expressing a vector encoding such a peptide in a host microorganism or chemically synthesizing
said peptide.

Claims for the following Contracting State: AT

1. A process for producing a homogeneous peptide having the sequence

VAL LYS LEU THR PRO ASP PHE ASP ASN PRO
LYS TRY ILE GLY ARG HIS LYS HIS MET PHE ASN PHE LEU ASP VAL ASN HIS ASN GLY ARG
ILE SER LEU ASP GLU MET VAL TYR LYS ALA SER ASP ILE VAL ILE ASN ASN LEU GLY ALA
THR PRO GLU GLN ALA LYS ARG HIS LYS ASP ALA VAL GLU ALA PHE PHE GLY GLY ALA GLY
MET LYS TYR GLY VAL GLU THR GLU TRY PRO GLU TYR ILE GLU GLY TRY LYS ARG LEU ALA
SER GLU GLU LEU LYS ARG TYR SER LYS ASN GLN ILE THR LEU ILE ARG LEU TRY GLY ASP
ALA LEU PHE ASP ILE ILE ASP LYS ASP GLN ASN GLY ALA ILE SER LEU ASP GLU TRY LYS
ALA TYR THR LYS SER ALA GLY ILE ILE GLN SER SER GLU ASP CYS GLU GLU THR PHE ARG
VAL CYS ASP ILE ASP GLU SER GLY GLN LEU ASP VAL ASP GLU MET THR ARG GLN HIS LEU
GLY PHE TRY TYR THR MET ASP PRO ALA CYS GLU LYS LEU TYR GLY GLY ALA VAL PRO;

in which sequence at least one of the following amino acids is replaced as follows:

PRO 5 is replaced by SER,
ASP 8 is replaced by ASN,
LYS 11 is replaced by ARG,
GLU 78 is replaced by ASP,
GLU 81 is replaced by ALA,
ARG 88 is replaced by LYS,
GLU 92 is replaced by ASP or CYS,
ARG 96 is replaced by LYS,
SER 98 is replaced by ALA,
GLN 101 is replaced by GLU,
ILE 102 is replaced by PRO,
LEU 107 is replaced by ILE,
ILE 116 is replaced by VAL,
SER 135 is replaced by ALA,
SER 141 is replaced by THR,

said peptide being capable of binding coelenterate luciferin and emitting light in the presence of Ca²⁺,
which process comprises either expressing a vector comprising a DNA sequence encoding such a
peptide in a host organism or chemically synthesizing said peptide sequence.

2. A process as claimed in claim 1 wherein from 1 to 15 amino acids are absent from either the amino
terminal, the carboxy terminal, or both terminals.

3. A process as claimed in claim 2 wherein said absent amino acids are absent from the amino terminal.

4. A process as claimed in claim 2 wherein said absent amino acids are absent from the carboxy terminal.

5. A process as claimed in claim 1 wherein from 1 to 10 additional amino acids are attached sequentially
to the amino terminal, carboxy terminal or both terminals.

6. A process as claimed in claim 5 wherein said additional amino acids are attached to the amino acid
terminal.

7. A process as claimed in claim 5 wherein said additional amino acids are attached to the carboxy
terminal.

8. A process for producing a substantially pure polynucleotide sequence

ATG ACC AGC GAA CAA TAC TCA GTC AAG CTT ACA CCA GAC TTC GAC AAC CCA
AAA TGG ATT GGA CGA CAC AAG CAC ATG TTT AAT TTT CTT GAT GTC AAC CAC AAT GGA AGG
ATC TCT CTT GAC GAG ATG GTC TAC AAG GCG TCC GAT ATT GTT ATA AAC AAT CTT GGA GCA
ACA CCT GAA CAA GCC AAA CGT CAC AAA GAT GCT GTA GAA GCC TTC TTC GGA GGA GCT GGA
ATG AAA TAT GGT GTA GAA ACT GAA TGG CCT GAA TAC ATC GAA GGA TGG AAA AGA CTG GCT
TCC GAG GAA TTG AAA AGG TAT TCA AAA AAC CAA ATC ACA CTT ATT CGT TTA TGG GGT GAT
GCA TTG TTC GAT ATC ATT GAC AAA GAC CAA AAT GGA GCT ATT TCA CTG GAT GAA TGG AAA
GCA TAC ACC AAA TCT GCT GGC ATC ATC CAA TCG TCA GAA GAT TGC GAG GAA ACA TTC AGA
GTG TGC GAT ATT GAT GAA AGT GGA CAG CTC GAT GTT GAT GAG ATG ACA AGA CAA CAT TTA
GGA TTT TGG TAC ACC ATG GAT CCT GCT TGC GAA AAG CTC TAC GGT GGA GCT GTC CCC.

in which sequence at least one of the following codons is replaced as follows:

CCA 12 is replaced by a codon encoding SER,
GAC 15 is replaced by a codon encoding ASN,
AAA 18 is replaced by a codon encoding ARG,
GAA 85 is replaced by a codon encoding ASP,
GAA 88 is replaced by a codon encoding ALA,
AGA 95 is replaced by a codon encoding LYS,
GAG 99 is replaced by a codon encoding ASP or CYS,
AGG 103 is replaced by a codon encoding LYS,
TCA 105 is replaced by a codon encoding ALA,
CAA 108 is replaced by a codon encoding GLU,
ATC 109 is replaced by a codon encoding PRO,
TTA 114 is replaced by a codon encoding ILE,
ATT 123 is replaced by a codon encoding VAL,
TCT 142 is replaced by a codon encoding ALA,
TCG 148 is replaced by a codon encoding THR,

said sequence encoding a peptide capable of binding coelenterate luciferin and emitting light in the
presence of Ca²⁺, which process comprises synthesizing said sequence using chemical techniques or
isolating said sequence via a nucleotide library derived from natural sources.

9. A process as claimed in claim 8 wherein from the 3' terminal from 8 to 22 codons, encoding from 8 to
22 amino acids, and from the 5' terminal from 1 to 15 codons, encoding from 1 to 15 amino acids, are
absent.

10. A process as claimed in claim 8 wherein from the 3' terminal from 8 to 22 codons, encoding from 8 to
22 amino acids, are absent.

11. A process as claimed in claim 8 wherein from the 5' terminal from 1 to 15 codons, encoding from 1 to
15 amino acids, are absent.

12. A process as claimed in claim 8 wherein from 1 to 3 codons, encoding from 1 to 3 amino acids, are
sequentially added to the 3' terminal and from 1 to 10 codons, encoding from 1 to 10 amino acids, are
sequentially added to the 5' terminal.

13. A process as claimed in claim 8 wherein from 1 to 3 codons, encoding from 1 to 3 amino acids, are
sequentially added to the 3' terminal.

14. A process as claimed in claim 8 wherein from 1 to 10 codons, encoding from 1 to 10 amino acids, are
sequentially added to the 5' terminal.

15. A process as claimed in any one of claims 8 to 14 wherein said sequence is single-stranded DNA.

16. A process as claimed in any one of claims 8 to 14 wherein said sequence is double-stranded DNA.

17. A process for producing a recombinant DNA vector which process comprises incorporating in a vector
capable of reproducing in a microorganism a nucleotide sequence produced in accordance with any
one of claims 8 to 16.

18. A process as claimed in claim 17 wherein said nucleotide sequence is preceded by a lac promoter.

19. A process for producing a genetically engineered microorganism which process comprises introducing
into the genotype of a microorganism a vector produced in accordance with claims 17 or 18.